Tabs versus Spaces in Source Code

Xah Lee

May 14, 2006, 10:04:40 PM
Tabs versus Spaces in Source Code

Xah Lee, 2006-05-13

In coding a computer program, there's often the choices of tabs or
spaces for code indentation. There is a large amount of confusion about
which is better. It has become what's known as “religious war” —
a heated fight over trivia. In this essay, i like to explain what is
the situation behind it, and which is proper.

Simply put, tabs is proper, and spaces are improper. Why? This may seem
ridiculously simple given the de facto ball of confusion: the semantics
of tabs is what indenting is about, while, using spaces to align code
is a hack.

Now, tech geekers may object this simple conclusion because they itch
to drivel about different editors and so on. The alleged problem
created by tabs as seen by the industry coders are caused by two
things: (1) tech geeker's sloppiness and lack of critical thinking
which lead them to not understanding the semantic purposes of tab and
space characters. (2) Due to the first reason, they have created and
propagated a massive non-understanding and misuse, to the degree that
many tools (e.g. vi) do not deal with tabs well and using spaces to
align code has become widely practiced, so that in the end spaces seem
to be actually better by popularity and seeming simplicity.

In short, this is a phenomenon of misunderstanding begetting a snowball
of misunderstanding, such that it created a cultural milieu to embrace
this malpractice and kick what is true or proper. Situations like this
happens a lot in unix. For one non-unix example, is the file name's
suffix known as “extension”, where the code of file's type became
part of the file name. (e.g. “.txt”, “.html”, “.jpg”).
Another well-known example is HTML practices in the industry, where
badly designed tags from corporation's competitive greed, and stupid
coding and misunderstanding by coders and their tools are so
wide-spread such that they force the correct way to the side by the
eventual standardization caused by sheer quantity of improper but set
practice.

Now, tech geekers may still object, that using tabs requires the
editors to set their positions, and plain files don't carry that
information. This is a good question, and the solution is to advance
the sciences such that your source code in some way embed such
information. This would be progress. However, this is never thought of
because the “unix philosophies” already conditioned people to hack
and be shallow. In this case, many will simply use the character
intended to separate words for the purpose of indentation or alignment,
and spread the practice with militant drivels.

Now, given the already messed up situation of the tabs vs spaces by the
unixers and unix brain-washing of the coders in the industry... Which
should we use today? I do not have a good proposition, other than just
use whichever that works for you but put more critical thinking into
things to prevent mishaps like this.

Tabs vs Spaces can be thought of as parameters vs hard-coded values, or
HTML vs ascii format, or XML/CSS vs HTML 4, or structural vs visual, or
semantic vs format. In these, it is always easy to convert from the
former to the latter, but near impossible from the latter to the
former. And, that is because the former encodes information that is
lost in the latter. If we look at the issue of tabs vs spaces, indeed,
it is easy to convert tabs to spaces in a source code, but more
difficult to convert from spaces to tabs. Because, tabs as indentation
actually contains the semantic information about indentation. With
spaces, this critical information is lost in space.

This issue is intimately related to another issue in source code:
soft-wrapped lines versus physical, hard-wrapped lines by EOL (end of
line character). This issue has far more consequences than tabs vs
spaces, and the unixer's unthinking has made far-reaching damages in
the computing industry. Due to unix's EOL ways of thinking, it has
created languages based on EOL (just about ALL languages except the
Lisp family and Mathematica) and tools based on EOL (cvs, diff, grep,
and basically every tool in unix), thoughts based on EOL (software
value estimation by counting EOL, hard-coded email quoting system by
“>” prefix, and silent line-truncations in many unix tools), such
that any progress or development towards an “algorithmic code unit”
concept or language syntaxes are suppressed. I have not written a full
account on this issue, but i've touched it in this essay: “The Harm
of hard-wrapping Lines”, at
http://xahlee.org/UnixResource_dir/writ/hard-wrap.html
----
This post is archived at:
http://xahlee.org/UnixResource_dir/writ/tabs_vs_spaces.html

Xah
x...@xahlee.org
http://xahlee.org/

Eli Gottlieb

May 14, 2006, 10:44:54 PM
Actually, spaces are better for indenting code. The exact amount of
space taken up by one space character will always be (or at least tend
to be) the same, while every combination of keyboard driver, operating
system, text editor, content/file format, and character encoding
changes precisely what the tab key does.

There's no use in typing "tab" for indentation when my text editor will
simply convert it to three spaces, or worse, autoindent and mix tabs
with spaces so that I have no idea how many actual whitespace characters
of what kinds are really taking up all that whitespace. I admit it
doesn't usually matter, but then you go back to try and make your code
prettier and find yourself asking "WTF?"

Undoubtedly adding the second spark to the holy war,
Eli

--
The science of economics is the cleverest proof of free will yet
constructed.

Edward Elliott

May 14, 2006, 11:28:29 PM
Eli Gottlieb wrote:

> Actually, spaces are better for indenting code. The exact amount of
> space taken up by one space character will always be (or at least tend
> to be) the same, while every combination of keyboard driver, operating
> system, text editor, content/file format, and character encoding
> changes precisely what the tab key does.

What you see as tabs' weakness is their strength. They encode '1 level of
indentation', not a fixed width. Of course tabs are rendered differently
by different editors -- that's the point. If you like indentation to be 2
or 3 or 7 chars wide, you can view your preference without forcing it on
the rest of the world. It's a logical rather than a fixed encoding.
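
A minimal Python sketch of that idea, assuming each leading tab stands for one level of indentation. The sample text and the `render` helper are made up for illustration; `str.expandtabs` stands in for whatever an editor does when it displays a tab.

```python
# Hypothetical sample: each leading tab means "one level of indentation".
SOURCE = "def f(x):\n\tif x:\n\t\treturn 1\n\treturn 0\n"


def render(text: str, tab_width: int) -> str:
    """Show how an editor with the given tab width would display the text."""
    return text.expandtabs(tab_width)


if __name__ == "__main__":
    # The file never changes; only the reader's preferred width does.
    for width in (2, 3, 7):
        print(f"--- tab width {width} ---")
        print(render(SOURCE, width), end="")
```

The stored bytes are identical in all three renderings, which is the sense in which the encoding is logical rather than fixed.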


> There's no use in typing "tab" for indentation when my text editor will
> simply convert it to three spaces, or worse, autoindent and mix tabs
> with spaces so that I have no idea how many actual whitespace characters
> of what kinds are really taking up all that whitespace. I admit it
> doesn't usually matter, but then you go back to try and make your code
> prettier and find yourself asking "WTF?"

Sounds like the problem is your editor, not tabs. But I wouldn't rule out
PEBCAK either. ;)


> Undoubtedly adding the second spark to the holy war,

Undoubtedly. Let's keep it civil, shall we? And please limit the
cross-posting to a minimum. (directed at the group, not you personally
Eli).

--
Edward Elliott
UC Berkeley School of Law (Boalt Hall)
complangpython at eddeye dot net

David Steuber

May 15, 2006, 12:35:42 AM
Spaces work better. Hitting the TAB key in my Emacs will auto-indent
the current line. Only spaces will be used for fill. The worst thing
you can do is mix the two regardless of how you feel about tab vs
space.

The next step in evil is to give tab actual significance like in
make.

Xah Lee is getting better at trolling. He might fill up Google's
storage.

--
http://www.david-steuber.com/
1998 Subaru Impreza Outback Sport
2006 Honda 599 Hornet (CB600F) x 2 Crash & Slider
The lithobraker. Zero distance stops at any speed.

jmcgill

May 15, 2006, 12:43:35 AM

If I work on your project, I follow the coding and style standards you
specify.

Likewise if you work on my project you follow the established standards.

Fortunately for you, I am fairly liberal on such matters.

I like to see 4 spaces for indentation. If you use tabs, that's what I
will see, and you're very likely to have your code reformatted by the
automated build process, when the standard copyright header is pasted
and missing javadoc tags are generated as warnings.

I like the open brace to start on the line of the control keyword. I
can deal with the open brace being on the next line, at the same level
of indentation as the control keyword. I don't quite understand the
motivation behind the GNU style, where the brace itself is treated as a
half-indent, but I can live with it on *your* project.

Any whitespace or other style that isn't happy to be reformatted
automatically is an error anyway.

I'd be very laissez-faire about it except for the fact that code
repositories are much easier to manage if everything is formatted before
it goes in, or as a compromise, as a step at release tags.

mystilleef

May 15, 2006, 3:56:29 AM
I agree, use tabs.

Mumia W.

May 15, 2006, 4:00:14 AM
Xah Lee wrote:
> Tabs versus Spaces in Source Code
>
> Xah Lee, 2006-05-13
>
> In coding a computer program, there's often the choices of tabs or
> spaces for code indentation. There is a large amount of confusion about
> which is better. It has become what's known as “religious war” —
> a heated fight over trivia. In this essay, i like to explain what is
> the situation behind it, and which is proper.
>

Thanks Xah. I value your posts. Keep posting. And since your posts
usually cover broad areas of CS, keep crossposting. Don't go anywhere
Xah :-)


> Simply put, tabs is proper, and spaces are improper. Why? This may seem
> ridiculously simple given the de facto ball of confusion: the semantics
> of tabs is what indenting is about, while, using spaces to align code
> is a hack.
>

I wouldn't say that spaces are a hack, but tabs are superior.

> Now, tech geekers may object this simple conclusion because they itch
> to drivel about different editors and so on. The alleged problem
> created by tabs as seen by the industry coders are caused by two
> things: (1) tech geeker's sloppiness and lack of critical thinking
> which lead them to not understanding the semantic purposes of tab and
> space characters. (2) Due to the first reason, they have created and
> propagated a massive non-understanding and misuse, to the degree that
> many tools (e.g. vi) do not deal with tabs well and using spaces to
> align code has become widely practiced, so that in the end spaces seem
> to be actually better by popularity and seeming simplicity.
>

Don't forget the laziness of programmers like me who don't put the
tabbing information in the source file. Vim deals with tabs well IMO,
but I almost never used to put the right auto-commands in the file to
get it set up right for other users.

> In short, this is a phenomenon of misunderstanding begetting a snowball
> of misunderstanding, such that it created a cultural milieu to embrace
> this malpractice and kick what is true or proper. Situations like this
> happens a lot in unix. For one non-unix example, is the file name's
> suffix known as “extension”, where the code of file's type became
> part of the file name. (e.g. “.txt”, “.html”, “.jpg”).
> Another well-known example is HTML practices in the industry, where
> badly designed tags from corporation's competitive greed, and stupid
> coding and misunderstanding by coders and their tools are so
> wide-spread such that they force the correct way to the side by the
> eventual standardization caused by sheer quantity of improper but set
> practice.
>
> Now, tech geekers may still object, that using tabs requires the
> editors to set their positions, and plain files don't carry that
> information. This is a good question, and the solution is to advance
> the sciences such that your source code in some way embed such
> information.

Vim does this. We just have to use it.

> This would be progress. However, this is never thought of
> because the “unix philosophies” already conditioned people to hack
> and be shallow. In this case, many will simply use the character
> intended to separate words for the purpose of indentation or alignment,
> and spread the practice with militant drivels.
>
> Now, given the already messed up situation of the tabs vs spaces by the
> unixers and unix brain-washing of the coders in the industry... Which
> should we use today? I do not have a good proposition, other than just
> use whichever that works for you but put more critical thinking into
> things to prevent mishaps like this.
>
> Tabs vs Spaces can be thought of as parameters vs hard-coded values, or
> HTML vs ascii format, or XML/CSS vs HTML 4, or structural vs visual, or
> semantic vs format. In these, it is always easy to convert from the
> former to the latter, but near impossible from the latter to the
> former. And, that is because the former encodes information that is
> lost in the latter.

Nope. Conversion is relatively easy. I've written programs to do this
myself, and everyone and his brother has also done this. Virtually every
programmer's editor that I've ever used can do this, and a great, great
many independent programs convert tabs to spaces. It's like saying,
"it's near impossible to write a calculator program." :-)

I bet that someone has a Perl one-liner to do it.

On any Debian system, try a "man expand" and see what you find. Also,
emacs and vim do it. Perl has a Text::Tabs module. TCL's
::textutil::(un)?tabify routines do it. The birds do it, and the bees do
it. Oh wait, that's something else :-)

> If we look at the issue of tabs vs spaces, indeed,
> it is easy to convert tabs to spaces in a source code, but more
> difficult to convert from spaces to tabs.

Nope again. It's easy, you just keep track of the virtual character
position as you decide whether to write a space or a tab. Computers do
the "counting" thing fairly well.
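
A minimal Python sketch of that column-tracking approach, limited to leading whitespace (roughly what a default `unexpand` run does). The function name `unexpand_leading` and the default tab width are assumptions made for illustration, not anything from an existing tool.

```python
def unexpand_leading(line: str, tab_width: int = 8) -> str:
    """Replace leading spaces with tabs wherever a run of spaces reaches a tab stop."""
    out = []
    column = 0   # virtual column of the next character
    pending = 0  # spaces seen since the last tab stop
    i = 0
    while i < len(line) and line[i] in " \t":
        if line[i] == "\t":
            # An existing tab jumps to the next stop and absorbs pending spaces.
            column += tab_width - column % tab_width
            pending = 0
            out.append("\t")
        else:
            column += 1
            pending += 1
            if column % tab_width == 0:
                # The run of spaces just reached a tab stop: emit one tab instead.
                out.append("\t")
                pending = 0
        i += 1
    # Spaces that did not reach a stop are kept as spaces.
    return "".join(out) + " " * pending + line[i:]


if __name__ == "__main__":
    print(repr(unexpand_leading("        x = 1")))
    print(repr(unexpand_leading("      y = 2", tab_width=4)))
```

The conversion preserves what the reader sees at the chosen tab width; it makes no attempt to guess what the author meant by leftover spaces.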

> Because, tabs as indentation
> actually contains the semantic information about indentation. With
> spaces, this critical information is lost in space.
>
> This issue is intimately related to another issue in source code:
> soft-wrapped lines versus physical, hard-wrapped lines by EOL (end of
> line character). This issue has far more consequences than tabs vs
> spaces, and the unixer's unthinking has made far-reaching damages in
> the computing industry. Due to unix's EOL ways of thinking, it has
> created languages based on EOL (just about ALL languages except the
> Lisp family and Mathematica) and tools based on EOL (cvs, diff, grep,
> and basically every tool in unix), thoughts based on EOL (software
> value estimation by counting EOL, hard-coded email quoting system by
> “>” prefix, and silent line-truncations in many unix tools), such
> that any progress or development towards an “algorithmic code unit”
> concept or language syntaxes are suppressed. I have not written a full
> account on this issue, but i've touched it in this essay: “The Harm
> of hard-wrapping Lines”, at
> http://xahlee.org/UnixResource_dir/writ/hard-wrap.html
> ----
> This post is archived at:
> http://xahlee.org/UnixResource_dir/writ/tabs_vs_spaces.html
>
> Xah
> x...@xahlee.org
> ∑ http://xahlee.org/
>

I've never thought of tabs-vs-spaces as a religious war. Anyway, the
authority of the programming environment will determine which one is
used. Have a good week Xah.

Edward Elliott

May 15, 2006, 1:39:22 PM
I already made my point about using tabs, so I only want to address Xah's
proposed solution of embedding tab width information in the source code.

Mumia W. wrote:

> Don't forget the laziness of programmers like me who don't put the
> tabbing information in the source file. Vim deals with tabs well IMO,
> but I almost never used to put the right auto-commands in the file to
> get it set up right for other users.

Auto-commands don't belong in source files. When tabs are used
consistently, there's nothing for you to set up for other users. And in
Python it would only make things worse, because the interpreter has a fixed
interpretation of tabs (8 spaces, iirc).
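
As a hedged illustration using present-day Python 3 (which postdates this thread): the tokenizer measures tabs against stops every 8 columns, and it rejects indentation whose meaning would depend on the tab width by raising `TabError`. The `compiles` helper and the two sample strings below are made up for the example.

```python
def compiles(source: str) -> bool:
    """Return True if Python 3 accepts the indentation in `source`."""
    try:
        compile(source, "<example>", "exec")
        return True
    except TabError:
        # Raised when tabs and spaces are mixed so that the block structure
        # would depend on how wide a tab is.
        return False


tabs_only = "if True:\n\tx = 1\n\ty = 2\n"
mixed = "if True:\n\tx = 1\n        y = 2\n"  # one tab, then eight spaces

if __name__ == "__main__":
    print(compiles(tabs_only))
    print(compiles(mixed))
```

Consistent tabs compile regardless of any editor setting; the mixed version is refused even though it looks aligned at a tab width of 8.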

>> Now, tech geekers may still object, that using tabs requires the
>> editors to set their positions, and plain files don't carry that
>> information. This is a good question, and the solution is to advance
>> the sciences such that your source code in some way embed such
>> information.
>
> Vim does this. We just have to use it.

Embedding the information in the file is the wrong approach. It trades the
utility of tabs as a logical indentation unit for a fixed value. It's
really no different than converting tabs to spaces in the first place. Tab
is a logical unit, it doesn't matter whether an editor displays them as 2,
5, or 14 spaces when they are used properly.

If you want a fixed width, use spaces. That's what they're for. There's no
point to another fixed-width character that changes with every file.

>> If we look at the issue of tabs vs spaces, indeed,
>> it is easy to convert tabs to spaces in a source code, but more
>> difficult to convert from spaces to tabs.
>
> Nope again. It's easy, you just keep track of the virtual character
> position as you decide whether to write a space or a tab. Computers do
> the "counting" thing fairly well.

There are complications when converting spaces to tabs. Say most lines are
indented by a multiple of 4 spaces, so you set tabwidth to 4. Now you find
a line indented 6 spaces. Should that be 1 tab, 2 tabs, 1 tab and two
spaces, or something else? It's easy to make an arbitrary choice among
those options, it's harder to deduce the intent behind this odd line. Is
it a mistake? A signal of some sort? That's not something you can decide
programmatically. I think that's what Xah was saying in a more general
way. Spaces and tabs carry different information that's not always
compatible.
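
A small Python sketch of that situation, assuming space-only indentation and a guessed tab width of 4. The helper names `classify_indent` and `ambiguous_lines` are hypothetical. The code can report lines that do not fit the guessed width, but it cannot say what the author intended by them.

```python
def classify_indent(line: str, tab_width: int = 4):
    """Split leading spaces into whole indent levels plus leftover spaces."""
    stripped = line.lstrip(" ")
    spaces = len(line) - len(stripped)
    levels, leftover = divmod(spaces, tab_width)
    return levels, leftover


def ambiguous_lines(text: str, tab_width: int = 4):
    """Return (line number, levels, leftover) for lines that do not fit the tab width."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        levels, leftover = classify_indent(line, tab_width)
        if leftover and line.strip():
            # Neither "levels" tabs nor "levels + 1" tabs is clearly right here.
            flagged.append((number, levels, leftover))
    return flagged


if __name__ == "__main__":
    sample = "a\n    b\n      c\n        d\n"
    print(ambiguous_lines(sample))
```

For the 6-space line the sketch reports one full level and two leftover spaces; choosing between 1 tab, 2 tabs, or 1 tab plus 2 spaces is still a human decision.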

Big and Blue

May 15, 2006, 3:03:16 PM
Edward Elliott wrote:

> What you see as tabs' weakness is their strength. They encode '1 level of
> indentation', not a fixed width. Of course tabs are rendered differently
> by different editors -- that's the point.

Yes - it is the point. It is the point why you should never use them.
The meaning of the tab is writer-dependent while the
representation/interpretation of a tab is reader-dependent. What you think
is lined up nicely will be mis-aligned for someone else.

If you wish to allow your meaning to be changed by any and every reader
then reconsider what you are writing.


--
Just because I've written it doesn't mean that
either you or I have to believe it.

Iain King

May 16, 2006, 5:39:04 AM
Oh God, I agree with Xah Lee. Someone take me out behind the chemical
sheds...

Iain

numeromancer

May 16, 2006, 9:48:43 AM
An old debate. My $0.02 :

http://numeromancer.dyndns.org/~timothy/tab-width-independence/description.html

The idea can be extended to other programming languages.

TS

Oliver Bandel

May 16, 2006, 11:23:13 AM
Xah Lee wrote:
> Tabs versus Spaces in Source Code
>
> Xah Lee, 2006-05-13
>
> In coding a computer program, there's often the choices of tabs or
> spaces for code indentation. There is a large amount of confusion about
> which is better. It has become what's known as “religious war” —
> a heated fight over trivia. In this essay, i like to explain what is
> the situation behind it, and which is proper.
>
> Simply put, tabs is proper, and spaces are improper.
[...]

I wholeheartedly disagree :)

So, no "essay" on this is necessary to read :->


Ciao,
Oliver

opalpa@gmail.com opalinski from opalpaweb

May 16, 2006, 11:31:31 AM
> Simply put, tabs is proper, and spaces are improper.
> Why? This may seem
> ridiculously simple given the de facto ball of confusion: the semantics
> of tabs is what indenting is about, while, using spaces to align code
> is a hack.

The reality of programming practice trumps original intent of tab
characters. The tab character and space character are pliable in that
if their use changes their semantics change.

> ... and the solution is to advance
> the sciences such that your source code in some way
> embed such information.

If and when the time comes that such info is embedded, perhaps then tabs
will be OK.

---------------------------------------------------------------

I use spaces because, of the many sources I've opened, I have many times
sighed on opening tabbed ones and never done so on opening spaced ones.

I don't get mad, but sighing is a clear indicator of negativity.
Anyway, the more code I write and read the less indentation matters to
me. My brain can now parse awkward source correctly far better than it
did a few years ago.


All the best,
Opalinski
opa...@gmail.com
http://www.geocities.com/opalpaweb/

Pascal Bourguignon

May 16, 2006, 11:40:54 AM

And anyways, C-x h C-M-\ comes automatically after C-x C-f source RET
Just add this to your ~/.emacs :

(add-hook 'find-file-hook
          (lambda () (indent-region (point-min) (point-max)) (pop-mark)))

--
__Pascal Bourguignon__ http://www.informatimago.com/

IMPORTANT NOTICE TO PURCHASERS: The entire physical universe,
including this product, may one day collapse back into an
infinitesimally small space. Should another universe subsequently
re-emerge, the existence of this product in that universe cannot be
guaranteed.

Dale King

May 16, 2006, 12:14:17 PM
Iain King wrote:
> Oh God, I agree with Xah Lee. Someone take me out behind the chemical
> sheds...
>
> Xah Lee wrote:
<more worthless nonsense>

Please don't feed the troll!

And for the record, spaces are 100% portable, tabs are not. That ends
the argument for me.

Worse than either tabs or spaces however is Sun's mixture of the two.
--
Dale King

Oliver Bandel

May 16, 2006, 12:15:23 PM
opa...@gmail.com opalinski from opalpaweb wrote:

>>Simply put, tabs is proper, and spaces are improper.
>>Why? This may seem
>>ridiculously simple given the de facto ball of confusion: the semantics
>>of tabs is what indenting is about, while, using spaces to align code
>>is a hack.
>
>
> The reality of programming practice trumps original intent of tab
> characters. The tab character and space character are pliable in that
> if their use changes their semantics change.

[...]


Yes, when I started programming I also preferred tabs.
With growing experience of how to handle this in real life
(different editors/systems/languages...), I saw that
converting the "so fine tabs" was annoying.

The only thing that always worked was spaces.
Tab: nice idea, but it makes programming an annoyance.

Ciao,
Oliver

Kaz Kylheku

May 16, 2006, 1:51:35 PM
Xah Lee wrote:
> Tabs vs Spaces can be thought of as parameters vs hard-coded values, or
> HTML vs ascii format, or XML/CSS vs HTML 4, or structural vs visual, or
> semantic vs format. In these, it is always easy to convert from the
> former to the latter, but near impossible from the latter to the
> former.

Bahaha, looks like someone hasn't thought things through very well.

Spaces, under a mono font, offer greater precision and expressivity in
achieving specific alignment. That expressivity cannot be captured by
tabs.

The difficulty in converting spaces to tabs rests not in any bridgeable
semantic gap, but in the lack of having any way whatsoever to express
using tabs what the spaces are expressing.

It's not /near/ impossible, it's /precisely/ impossible.

For instance, tabs cannot express these alignments:

/*
 * C block
 * comment
 * in a common style.
 */

(lisp
 (nested list
         with symbols
         and things))

(call to a function
      with many parameters)
;; how do you align "to" and "with" using tabs?
;; only if "to" lands on a tab stop; but dependence on specific tab stops
;; destroys the whole idea of tabs being parameters.

To do these alignments structurally, you need something more expressive
than spaces or tabs. But spaces do the job under a mono font, /and/
they do it in a completely unobtrusive way.

If you want to do nice typesetting of code, you have to add markup
which has to be stripped away if you actually want to run the code.

Spaces give you decent formatting without markup. Tabs do not. Tabs are
only suitable for aligning the first non-whitespace character of a line
to a stop. Only if that is the full extent of the formatting that you
need to express in your code can you achieve the ideal of being able to
change your tab parameter to change the indentation amount. If you need
to align characters which aren't the first non-whitespace in a line,
tabs are of no use whatsoever, and proportional fonts must be banished.
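
The tab-stop dependence can be checked with a short Python sketch. The sample lines are hypothetical (chosen so that "to" starts at column 8), and `column_of` and `is_aligned` are made-up helper names; `str.expandtabs` models how a reader's editor would display the text.

```python
FIRST = "(callfn to a function"            # "to" starts at column 8
TAB_CONTINUATION = "\twith many parameters)"
SPACE_CONTINUATION = " " * 8 + "with many parameters)"


def column_of(line: str, word: str, tab_width: int) -> int:
    """Column at which `word` appears once tabs are expanded."""
    return line.expandtabs(tab_width).index(word)


def is_aligned(continuation: str, tab_width: int) -> bool:
    """True if "with" on the continuation line sits directly under "to"."""
    return column_of(FIRST, "to", tab_width) == column_of(continuation, "with", tab_width)


if __name__ == "__main__":
    for width in (2, 4, 8):
        print(width,
              is_aligned(TAB_CONTINUATION, width),
              is_aligned(SPACE_CONTINUATION, width))
```

The tab version lines up only for the reader whose tab width happens to be 8; the space version lines up for everyone, provided the font is monospaced.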

achates

May 16, 2006, 2:46:01 PM
Kaz Kylheku wrote:

> If you want to do nice typesetting of code, you have to add markup
> which has to be stripped away if you actually want to run the code.

Typesetting code is not a helpful activity outside of the publishing
industry. You might like the results of your typesetting; I happen not
to. You probably wouldn't like mine. Does that mean we shouldn't work
together? Only if you insist on forcing me to conform to your way of
displaying code.

You are correct in pointing out that tabs don't allow for 'alignment'
of the sort you mention:


(lisp
 (nested list
         with symbols
         and things))

But then neither does Python. I happen to think that's a feature.

(And of course you can do what you like inside a comment. That's
because tabs are for indentation, and indentation is meaningless in that
context. Spaces are exactly what you should use then. I may or may not
like your layout, but it won't break anything when we merge our code.)

achates

May 16, 2006, 3:22:06 PM
argh, sorry; missed the cross-post. Was replying from comp.lang.python..

foo bar baz qux

May 16, 2006, 4:40:44 PM
Mumia W. wrote:

> Xah Lee wrote:
> >
> > In coding a computer program, there's often the choices of tabs or
> > spaces for code indentation. There is a large amount of confusion about
> > which is better. It has become what's known as "religious war" -
> > a heated fight over trivia. In this essay, i like to explain what is
> > the situation behind it, and which is proper.
> >
>
> Thanks Xah. I value your posts. Keep posting. And since your posts
> usually cover broad areas of CS, keep crossposting. Don't go anywhere
> Xah :-)

<barf> I hope that is sarcasm. The public licking of troll's arses is
rather unedifying.


> > Simply put, tabs is proper, and spaces are improper. Why?

I find it hard to pay much attention to someone who aspires to lecture
the masses yet is unable to grasp rudimentary English language concepts
such as singular and plural. I think Xah should start with the correct
use of the plural form before he moves on to a study of tabs and
spaces.


> > This may seem ridiculously simple given the de facto ball of confusion:

Yeah but what about the cube of confusion? How about other platonic
solids of confusion?


> > the semantics of tabs is what indenting is about,

And I thought indenting was about the semantics of the *program* not
the *tabs*. I do wonder whether Xah is able to express fully the
meaning of the tabulation character through the indenting of his
source code.


> > Now, tech geekers may object this simple conclusion because they itch
> > to drivel about different editors and so on.

Xah is a prime example of a "tech geek" driveling about this sort of
trivia.


> > In short, this is a phenomenon of misunderstanding begetting a snowball
> > of misunderstanding, such that it created a cultural milieu to embrace
> > this malpractice and kick what is true or proper.

A fine example of convoluted self parody.


> > Situations like this happens a lot in unix.

Ooh ooh lets have an example ...

> > For one non-unix example, is the file name's suffix known as "extension",

Err, not Unix.

> > Another well-known example is HTML practices in the industry, where
> > badly designed tags

Err, still not Unix.

<snip: Xah's recipe for world peace>


> > This would be progress. However, this is never thought of
> > because the "unix philosophies" already conditioned people to hack
> > and be shallow. In this case, many will simply use the character
> > intended to separate words for the purpose of indentation or alignment,

The first typewriters had a space-bar but no tabulation key. Therefore
we can see that the space character was invented for indentation and
alignment as well as for separating words.

From its name we can deduce that the tabulation key was intended
primarily for laying out tables of information in typewritten
documents. It wasn't named "the indentation key".


> > and spread the practice with militant drivels.

Unlike Xah, who wouldn't post "militant drivel"?


> > Which should we use today?
> > I do not have a good proposition

Few should be surprised at this.


> > Because, tabs as indentation
> > actually contains the semantic information about indentation.

Xah asserts that an indentation of n * char(x) carries more meaning
than an indentation of n * char(y)?


Mumia - repent!

Kaz Kylheku

May 16, 2006, 6:01:18 PM
achates wrote:
> Kaz Kylheku wrote:
>
> > If you want to do nice typesetting of code, you have to add markup
> > which has to be stripped away if you actually want to run the code.
>
> Typesetting code is not a helpful activity outside of the publishing
> industry.

Be that as it may, code writing involves an element of typesetting. If
you are aligning characters, you are typesetting, however crudely.

> You might like the results of your typesetting; I happen not
> to. You probably wouldn't like mine. Does that mean we shouldn't work
> together? Only if you insist on forcing me to conform to your way of
> displaying code.

Someone who insists that everyone should separate line indentation into
tabs which achieve the block level, and spaces that achieve additional
alignment, so that code could be displayed in more than one way based
on the tab size without loss of alignment, is probably a "space cadet",
who has a bizarre agenda unrelated to developing the product.

There is close to zero value in maintaining such a scheme, and
consequently, it's hard to justify with a business case.
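
For reference, the scheme in question (tabs up to the block level, spaces for any further alignment) can be sketched in Python. The two sample lines and the `columns` helper are made up for illustration; the sketch only shows that the alignment survives a change of tab width, not that the discipline is worth enforcing.

```python
# One tab for the block level, then spaces to line "beta" up under "alpha".
LINES = [
    "\tresult = compute(alpha,",
    "\t" + " " * len("result = compute(") + "beta)",
]


def columns(tab_width: int):
    """Columns of "alpha" and "beta" as displayed at the given tab width."""
    first, second = (line.expandtabs(tab_width) for line in LINES)
    return first.index("alpha"), second.index("beta")


if __name__ == "__main__":
    for width in (2, 4, 8):
        print(width, columns(width))
```

Both columns shift together as the tab width changes, so the hanging indent stays intact; the cost is that every contributor and tool has to keep the tab/space boundary exactly at the block level.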

Yes, in the real world, you have to conform to someone's way of
formatting and displaying code. That's how it is.

You have to learn to read, write and even like more than one style.

> You are correct in pointing out that tabs don't allow for 'alignment'
> of the sort you mention:

That alignment has a name: hanging indentation.

All forms of aligning the first character of a line to some requirement
inherited from the previous line are called indentation.

Granted, a portion of that indentation is derived from the nesting
level of some logically enclosing programming language construct, and
part of it may be derived from the position of a character of some
parallel constituent within the construct.

> (lisp
>  (nested list
>          with symbols
>          and things))
> But then neither does Python. I happen to think that's a feature.

Python has logical line continuation which gives rise to the need for
hanging indents to line up with parallel constituents in a folded
expression.

Python also allows for the possibility of statements separated by
semicolons on one line, which may need to be lined up in columns.

var = 42; foo = 53
x   = 2;  y   = 10

> (And of course you can do what you like inside a comment. That's
> because tabs are for indentation, and indentation is meaningless in that
> context.

A comment can contain example code, which contains indentation.

What, I can't change the tab size to display that how I want? Waaah!!!
(;_;)

Aaron Gray

unread,
May 16, 2006, 8:14:07 PM5/16/06
to
I was once a religious tabber until I worked on code from multiple sources;
now I am a religious spacer :)

My 2bits worth,

Aaron


Bill Pursell

unread,
May 17, 2006, 9:51:19 AM5/17/06
to

Xah Lee wrote:
> Tabs versus Spaces in Source Code
>
> Xah Lee, 2006-05-13
>
> In coding a computer program, there's often the choices of tabs or
> spaces for code indentation.
<snip>

> (2) Due to the first reason, they have created and
> propagated a massive none-understanding and mis-use, to the degree that
> many tools (e.g. vi) does not deal with tabs well

:set ts=<n>

Yeah, that's really tough. vi does just fine handling tabs. vim does
an even better job, with mode-lines, = and :retab.

In my experience, the people who complain about the use
of tabs for indentation are the people who don't know
how to use their editor, and those people tend to use
emacs.

Big and Blue

unread,
May 17, 2006, 6:00:51 PM5/17/06
to
Bill Pursell wrote:
>
> In my experience, the people who complain about the use
> of tabs for indentation are the people who don't know
> how to use their editor, and those people tend to use
> emacs.

So, the people who do use tabs are those who think they should be able
to force their view/setting onto others.

I prefer to work *with* people, not *against* them. My experience is
that tabs can (and hence do) cause problems. Spaces do not. It seems
that those who work with code from many sources, rather than just their own,
prefer spaces.

Edmond Dantes

unread,
May 17, 2006, 6:40:40 PM5/17/06
to
Oliver Bandel wrote:

It all depends on your editor of choice. Emacs editing of Lisp (and a few
other languages, such as Python) makes the issue more or less moot. I
personally would recommend choosing one editor to use with all your
projects, and Emacs is wonderful in that it has been ported to just about
every platform imaginable.

The real issue is, of course, that ASCII is showing its age and we should
probably supplant it with something better. But I know that will never fly,
given the torrents of code, configuration files, and everything else in
ASCII. Even Unicode couldn't put a dent in it, despite the obvious growing
global development efforts. Not sure how many compilers would be able to
handle Unicode source anyway. I suspect the large majority of them would
choke big time.

Oh well...

--
-- Edmond Dantes, CMC

Edward Elliott

unread,
May 17, 2006, 7:52:48 PM5/17/06
to
<posted & mailed>

Edmond Dantes wrote:

> in ASCII. Even Unicode couldn't put a dent in it, despite the obvious
> growing global development efforts. Not sure how many compilers would be
> able to handle Unicode source anyway. I suspect the large majority of them
> would would choke big time.

The real problem isn't the compilers, it's these damn ASCII keyboards. Show
me a workable Unicode replacement and we'll talk. :)

John Bokma

unread,
May 17, 2006, 8:26:01 PM5/17/06
to
Edmond Dantes <edm...@le-comte-de-monte-cristo.biz> wrote:

> despite the obvious growing global development efforts. Not sure how
> many compilers would be able to handle Unicode source anyway.

javac does afaik.

--
John Bokma Freelance software developer
&
Experienced Perl programmer: http://castleamber.com/

ashesh

unread,
May 18, 2006, 1:45:27 AM5/18/06
to


Ashesh..

Alain Picard

unread,
May 18, 2006, 5:46:10 AM5/18/06
to
"Bill Pursell" <bill.p...@gmail.com> writes:

> In my experience, the people who complain about the use
> of tabs for indentation are the people who don't know
> how to use their editor, and those people tend to use
> emacs.

HA HA HA HA HA HA HA HA HA HA HA HA ....

Tee, hee heee.... snif!

Phew. Better now.

That was funny! Thanks! :-)

Pascal Bourguignon

unread,
May 18, 2006, 7:10:13 AM5/18/06
to
Edmond Dantes <edm...@le-comte-de-monte-cristo.biz> writes:
> It all depends on your editor of choice. Emacs editing of Lisp (and a few
> other languages, such as Python) makes the issue more or less moot. I
> personally would recommend choosing one editor to use with all your
> projects, and Emacs is wonderful in that it has been ported to just about
> every platform imaginable.
>
> The real issue is, of course, that ASCII is showing its age and we should
> probably supplant it with something better. But I know that will never fly,
> given the torrents of code, configuration files, and everything else in
> ASCII. Even Unicode couldn't put a dent in it, despite the obvious growing
> global development efforts. Not sure how many compilers would be able to
> handle Unicode source anyway. I suspect the large majority of them would
> would choke big time.

All right, Unicode support is not 100% perfect yet, but my main
compilers support it perfectly well; only 1/5 don't support it, and
1/5 support it partially:

------(unicode-script.lisp)---------------------------------------------

(defun clisp (file)
  (ext:run-program "/usr/local/bin/clisp"
                   :arguments (list "-ansi" "-norc" "-on-error" "exit"
                                    "-E" "utf-8"
                                    "-i" file "-x" "(ext:quit)")
                   :input nil :output :terminal :wait t))

(defun gcl (file)
  (ext:run-program "/usr/local/bin/gcl"
                   :arguments (list "-batch"
                                    "-load" file "-eval" "(lisp:quit)")
                   :input nil :output :terminal :wait t))

(defun ecl (file)
  (ext:run-program "/usr/local/bin/ecl"
                   :arguments (list "-norc"
                                    "-load" file "-eval" "(si:quit)")
                   :input nil :output :terminal :wait t))

(defun sbcl (file)
  (ext:run-program "/usr/local/bin/sbcl"
                   :arguments (list "--userinit" "/dev/null"
                                    "--load" file "--eval" "(sb-ext:quit)")
                   :input nil :output :terminal :wait t))

(defun cmucl (file)
  (ext:run-program "/usr/local/bin/cmucl"
                   :arguments (list "-noinit"
                                    "-load" file "-eval" "(extensions:quit)")
                   :input nil :output :terminal :wait t))


(dolist (implementation '(clisp gcl ecl sbcl cmucl))
  (sleep 3)
  (terpri) (print implementation) (terpri)
  (funcall implementation "unicode-source.lisp"))

------(unicode-source.lisp)---------------------------------------------
;; -*- coding: utf-8 -*-

(eval-when (:compile-toplevel :load-toplevel :execute)
  (format t "~2%~A ~A~2%"
          (lisp-implementation-type)
          (lisp-implementation-version))
  (finish-output))


(defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
  (loop :for i :from בכוכ :to номер :by 단계 :collect i))


(defun test ()
  (format t "~%Calling ~S --> ~A~%"
          '(ιοτα :номер 10 :단계 2 :בכוכ 2)
          (ιοτα :номер 10 :단계 2 :בכוכ 2)))

(test)

------------------------------------------------------------------------

(load"unicode-script.lisp")
;; Loading file unicode-script.lisp ...

CLISP
  i i i i i i i       ooooo    o        ooooooo   ooooo   ooooo
  I I I I I I I      8     8   8           8     8     o  8    8
  I  \ `+' /  I      8         8           8     8        8    8
   \  `-+-'  /       8         8           8      ooooo   8oooo
    `-__|__-'        8         8           8           8  8
        |            8     o   8           8     o     8  8
  ------+------       ooooo    8oooooo  ooo8ooo   ooooo   8

Copyright (c) Bruno Haible, Michael Stoll 1992, 1993
Copyright (c) Bruno Haible, Marcus Daniels 1994-1997
Copyright (c) Bruno Haible, Pierpaolo Bernardi, Sam Steingold 1998
Copyright (c) Bruno Haible, Sam Steingold 1999-2000
Copyright (c) Sam Steingold, Bruno Haible 2001-2006

;; Loading file unicode-source.lisp ...

CLISP 2.38 (2006-01-24) (built 3347193361) (memory 3347193794)


Calling (ΙΟΤΑ :НОМЕР 10 :단계 2 :בכוכ 2) --> (2 4 6 8 10)
;; Loaded file unicode-source.lisp
Bye.


GCL


GNU Common Lisp (GCL) GCL 2.6.7


Calling (ιοτα :номер 10 :단계 2 :בכוכ 2) --> (2 4 6 8
10)


ECL
;;; Loading "unicode-source.lisp"


ECL 0.9g


Calling (ιοτα :номер 10 :단계 2 :בכוכ 2) --> (2 4 6 8 10)


SBCL
This is SBCL 0.9.12, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.


SBCL 0.9.12


Calling (|ιοτα| :|номер| 10 :|ˋ¨ʳ„| 2 :|בכוכ| 2) --> (2 4 6 8 10)


CMUCL
; Loading #P"/local/users/pjb/src/lisp/encours/unicode-source.lisp".


CMU Common Lisp 19c (19C)


Reader error at 214 on #<Stream for file "/local/users/pjb/src/lisp/encours/unicode-source.lisp">:
Undefined read-macro character #\ÃŽ
[Condition of type READER-ERROR]

Restarts:
0: [CONTINUE] Return NIL from load of "unicode-source.lisp".
1: [ABORT ] Skip remaining initializations.

Debug (type H for help)

(LISP::%READER-ERROR
#<Stream for file "/local/users/pjb/src/lisp/encours/unicode-source.lisp">
"Undefined read-macro character ~S"
#\ÃŽ)
Source: Error finding source:
Error in function DEBUG::GET-FILE-TOP-LEVEL-FORM: Source file no longer exists:
target:code/reader.lisp.
0] abort
*
Received EOF on *standard-input*, switching to *terminal-io*.
* (extensions:quit)
;; Loaded file unicode-script.lisp
T
[4]>


--
__Pascal Bourguignon__ http://www.informatimago.com/

Grace personified,
I leap into the window.
I meant to do that.

Ben Morrow

unread,
May 17, 2006, 8:58:57 PM5/17/06
to

Quoth John Bokma <jo...@castleamber.com>:

> Edmond Dantes <edm...@le-comte-de-monte-cristo.biz> wrote:
>
> > despite the obvious growing global development efforts. Not sure how
> > many compilers would be able to handle Unicode source anyway.
>
> javac does afaik.

Perl (yes, it *is* a compiler)
Haskell compilers ought to (it's in the language spec).

I wouldn't be *that* surprised if the next rev of C (if there is one)
had the source in Unicode. It's the way the world's going.

Ben

--
Heracles: Vulture! Here's a titbit for you / A few dried molecules of the gall
From the liver of a friend of yours. / Excuse the arrow but I have no spoon.
(Ted Hughes, [ Heracles shoots Vulture with arrow. Vulture bursts into ]
'Alcestis') [ flame, and falls out of sight. ] benm...@tiscali.co.uk

Jonathon McKitrick

unread,
May 18, 2006, 10:42:08 AM5/18/06
to
Pascal Bourguignon wrote:
> (defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
> (loop :for i :from בכוכ :to номер :by 단계 :collect i))

How do you even *enter* these characters? My browser seems to trap all
the special character combinations, and I *know* you don't mean
selecting from a character palette.

࿿ hey, this is weird...

î

I've got something happening, but I can't tell what.

Yes, I'm an ignorant Western world ASCII user. :-)

Pascal Bourguignon

unread,
May 18, 2006, 1:24:15 PM5/18/06
to
"Jonathon McKitrick" <j_mck...@bigfoot.com> writes:

> Pascal Bourguignon wrote:
>> (defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
>> (loop :for i :from בכוכ :to номер :by 단계 :collect i))
>
> How do you even *enter* these characters? My browser seems to trap all
> the special character combinations, and I *know* you don't mean
> selecting from a character palette.

Why? Of course!
Aren't you either an emacs or a Mac user?

On a Mac, you just select the input keyboard from the Input menu (the
little flag on the right of the menubar; you may activate it from the
International System Preference panel).

In Emacs, it's just as simple: M-x set-input-method RET

I've bound C-F9, C-F10, C-F11, and C-F12 to various input methods:

(global-set-key [C-f9] (lambda()(interactive)(set-input-method 'chinese-py-b5)))
(global-set-key [C-f10] (lambda()(interactive)(set-input-method 'cyrillic-yawerty)))
(global-set-key [C-f11] (lambda()(interactive)(set-input-method 'greek)))
(global-set-key [C-f12] (lambda()(interactive)(set-input-method 'hebrew)))

C-\ is bound to toggle-input-method, which lets you switch back to the
usual input method.

For the alphabetic scripts, there's no difficulty, it's like with
roman scripts: each key is a character. For ideographic scripts, the
input methods are more sophisticated.

Then, you have to learn some of these strange languages. I learned
several (but I forgot everything but: לודג גד דג ינד, здраствуйте, я
люблю тибе, 我 聽龍, 我 不 中国人). For the Korean, I copy-and-pasted
it from some web translation service. But keying them in is the
easiest part.

--
__Pascal Bourguignon__ http://www.informatimago.com/

Cats meow out of angst
"Thumbs! If only we had thumbs!
We could break so much!"

Oliver Bandel

unread,
May 18, 2006, 2:25:14 PM5/18/06
to
Edmond Dantes wrote:

> Oliver Bandel wrote:
>
>
>>opa...@gmail.com opalinski from opalpaweb wrote:
>
> ...
>
>>Yes, as I started programming I also preferred tabs.
>>And with growing experience on how to handle this in true life
>>(different editors/systems/languages...) I saw, that
>>converting the "so fine tabs" was annoying.
>>
>>The only thing that always worked were spaces.
>>Tab: nice idea but makes programming an annoyance.
>>
>>Ciao,
>> Oliver
>
>
> It all depends on your editor of choice. Emacs editing of Lisp (and a few
> other languages, such as Python) makes the issue more or less moot.

I don't always have the editor I want at hand.

As you might see above, I wrote that I also had to use
different editors...

..it does not help to have my editor on my system, when
I have to work elsewhere.

BTW: I doubt Emacs can always help.

[...]


> The real issue is, of course, that ASCII is showing its age and we should
> probably supplant it with something better.

I doubt that Unicode is the ne plus ultra...

ASCII is fine, because it is a limited set.

KISS also makes sense in character encodings, IMHO.

BTW: Would Unicode help in the Tab-vs-Space problem?
Or would we choose different spacing characters then?
Maybe the problems will grow with the encoding capacity
of the char-encoding... who knows?


Ciao,
Oliver

Oliver Bandel

unread,
May 18, 2006, 2:31:53 PM5/18/06
to
Jonathon McKitrick wrote:

> Pascal Bourguignon wrote:
>
>>(defun ιοτα (&key (номер 10) (단계 1) (בכוכ 0))
>> (loop :for i :from בכוכ :to номер :by 단계 :collect i))
>
>
> How do you even *enter* these characters? My browser seems to trap all
> the special character combinations, and I *know* you don't mean
> selecting from a character palette.

Haven't you heard of those big keyboards?

12 meters x 2 meters wide, I think.... you need a long
stick (maybe if you play golf, that can help).

Then you have all UTF-8 characters there, that's fine,
but typing takes some time.
But it's good, because when you're done typing your email,
it's not necessary to go to sports after work. So your boss
can insist that you stay longer at work.


Ciao,
Oliver

;-)

Abigail

unread,
May 18, 2006, 4:24:28 PM5/18/06
to
Edmond Dantes (edm...@le-comte-de-monte-cristo.biz) wrote on MMMMDCXLII
September MCMXCIII in <URL:news:1147905290_7@news-east.n>:
^^ Oliver Bandel wrote:
^^
^^ > opa...@gmail.com opalinski from opalpaweb wrote:
^^ ...
^^ > Yes, as I started programming I also preferred tabs.
^^ > And with growing experience on how to handle this in true life
^^ > (different editors/systems/languages...) I saw, that
^^ > converting the "so fine tabs" was annoying.
^^ >
^^ > The only thing that always worked were spaces.
^^ > Tab: nice idea but makes programming an annoyance.
^^ >
^^ > Ciao,
^^ > Oliver
^^
^^ It all depends on your editor of choice.

It doesn't.

Tabs are evil. The argument[1] in favour of tabs is basically that
people cannot agree on how much indentation to use - and if people
use tabs, every programmer can set the tabstop to his or her preference.

But this is an utterly naive view. It would work if windows or screens
didn't have a right margin, or if code would automatically wrap nicely.
But life isn't such. Coders use a certain window size - with 80 characters
being very common. Imagine a program containing a line of code of 70
characters not counting leading or trailing whitespace. It's inside an if
statement, inside a loop in a subroutine, so it's indented three levels.
The programmer who wrote the subroutine uses 2 space indent. But naive
as he is, he believes in the 'tabs are good' fairy tale. So, he sets his
tabstop to 2, and the line of 70 characters is preceded by three tabs.
The final character of the line is in column 76 - within the 80 character
limit. The next person working on the code prefers a 4 space indent. So
his tabstop is set to 4. His screen is also set to an 80 character width.
However, when he sees the line being discussed, it extends to column 82 -
OUTSIDE of the boundaries. Had the first programmer used spaces, this
problem wouldn't have happened.
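
The column arithmetic in that scenario can be checked with a few lines of Python. This is only a sketch of the example as described: the 70 `x` characters stand in for the line of code, and `str.expandtabs` plays the role of the editor's tabstop setting.

```python
# Three levels of tab indentation followed by 70 characters of code.
line = "\t" * 3 + "x" * 70

for tabstop in (2, 4):
    width = len(line.expandtabs(tabstop))
    fits = "fits" if width <= 80 else "does NOT fit"
    print(f"tabstop={tabstop}: last character in column {width}, {fits} in 80 columns")
```

With a tabstop of 2 the line ends in column 76; with a tabstop of 4 it ends in column 82.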

Tabs are evil.

^^ The real issue is, of course, that ASCII is showing its age and we should
^^ probably supplant it with something better.

This has nothing to do with ASCII.

^^ But I know that will never fly,
^^ given the torrents of code, configuration files, and everything else in
^^ ASCII. Even Unicode couldn't put a dent in it, despite the obvious growing
^^ global development efforts. Not sure how many compilers would be able to
^^ handle Unicode source anyway. I suspect the large majority of them would
^^ would choke big time.

Tabs are a part of Unicode. In fact, *ALL* of ASCII is part of Unicode.
The first 128 code points in Unicode are the ASCII characters.
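
A quick way to convince yourself, sketched in Python (the variable names are arbitrary): decode every 7-bit byte value as ASCII and as UTF-8 and compare the results.

```python
# Every byte value from 0 to 127, i.e. the whole ASCII range.
ascii_bytes = bytes(range(128))

as_ascii = ascii_bytes.decode("ascii")
as_utf8 = ascii_bytes.decode("utf-8")

# Same characters either way, and their code points are simply 0..127.
print(as_ascii == as_utf8)
print([ord(c) for c in as_utf8] == list(range(128)))

# The tab character is code point 9 in both.
print(ord("\t"), "\t".encode("utf-8"))
```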

Abigail
--
$_ = "\112\165\163\1648\141\156\157\164\150\145\1628\120\145"
. "\162\1548\110\141\143\153\145\162\0128\177" and &japh;
sub japh {print "@_" and return if pop; split /\d/ and &japh}

Edward Elliott

unread,
May 18, 2006, 4:54:44 PM5/18/06
to
Abigail wrote:

> The programmer who wrote the subroutine uses 2 space indent. But naive
> as he is, he believes in the 'tabs are good' fairy tale. So, he sets his
> tabstop to 2, and the line of 70 characters is preceded by three tabs.
> The final character of the line is in column 76 - within the 80 character
> limit. The next person working on the code prefers a 4 space indent. So
> his tabstop is set to 4. His screen is also set to an 80 character width.
> However, when he sees the line being discussed, it extends to column 82 -
> OUTSIDE of the boundaries. Had the first programmer used spaces, this
> problem wouldn't have happened.

There are a number of solutions to this problem besides banning tabs. The
second programmer can: set his tab width smaller for that file, use smarter
line wrapping, use a tool to reformat the entire file, use a wider
terminal, or just live with the long lines.

I'm not advocating for tabs, just pointing out your conclusion does not
follow from your premise.

Roedy Green

unread,
May 19, 2006, 9:44:10 PM5/19/06
to
On Mon, 15 May 2006 02:44:54 GMT, Eli Gottlieb <eligo...@gmail.com>
wrote, quoted or indirectly quoted someone who said :

>Actually, spaces are better for indenting code.

Agreed. All it takes is one programmer to use a different tab
expansion convention to screw up a project. Spaces are unambiguous.

Ideally though you should run code through a beautifier before checkin
to avoid false deltas with people manually formatting code slightly
differently.
--
Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.

Xah Lee

unread,
May 23, 2006, 7:02:42 AM5/23/06
to
the following are 2 FAQ following this thread. Thanks.

Addendum: 2006-05-15

Q: What do you mean by embedding tab position info into the source code?
How's that gonna be done?

A: Tech geekers may not realize, but such embedding of meta info does
exist in many technologies by various means because of a need. For
example, Mac OS Classic's resource fork and Mac OS X's bundling system,
unix shell script's shebang (#!), emacs and Python's encoding
declaration “#-*- coding: utf-8 -*-”, Unicode's BOM, CVS's
change-log insertion, Mathematica's source code system the Notebook,
Microsoft Word's transparent meta data, as well as HTML and XML's
various declarations embedded in the file. Some of these systems are
good designs and some are hacks.

Somehow tech geekers have the sense that “source code” must be a
plain text file containing nothing else but the programming code. This
may be a defensible position, but as we can see in the above examples,
this idea is primitive and does not address the various needs. If the
tech geekers had thought about these issues, computing languages
and their source code might have developed into more powerful and flexible
integrated systems like the above standardized examples. For instance,
many commercial development systems actually already have such
meta-data embodied with the source code. (e.g. Borland Delphi,
Metrowerks's CodeWarrior, Microsoft Visual Studio, Wolfram Research's
Mathematica.) Some of which, not only embody development-related info
such as debug points or linking files, but also allow programers to
high-light code for visual purposes like a word processor, or even
display them visually as type-set mathematics.

Q: Converting spaces to tabs is actually easy. I don't see how spaces
lose info.

A: Here is an illustration of how it is not possible to convert spaces
to tabs. Suppose you are writing in a language where the indentation is
part of the semantics, not just for appearance. Now, suppose you have
these two lines:

1234567890
  A
    B

The first line has a 2-space prefix and the second line has a 4-space
prefix. Now, if you convert this to tabs, how do you know that's 1 and 2
tabs, or 2 and 4 tabs? In essence, there is no way to tell how many tabs n
represents, where n is the smallest space prefix in the code, unless n
== 1.
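
The ambiguity can be made concrete with a small Python sketch. The helper `spaces_to_tabs` and its `unit` parameter are hypothetical, introduced only to show that the converter has to be told how many spaces make one level:

```python
def spaces_to_tabs(text: str, unit: int) -> str:
    """Replace leading spaces with tabs, assuming `unit` spaces per indent level."""
    converted = []
    for line in text.splitlines():
        body = line.lstrip(" ")
        levels, leftover = divmod(len(line) - len(body), unit)
        converted.append("\t" * levels + " " * leftover + body)
    return "\n".join(converted)


sample = "  A\n    B"   # a 2-space prefix and a 4-space prefix

print(repr(spaces_to_tabs(sample, unit=2)))   # 1 tab and 2 tabs
print(repr(spaces_to_tabs(sample, unit=1)))   # 2 tabs and 4 tabs
```

Both results are consistent with the same input; nothing in the text itself says which one was meant.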

The above demonstrates the information loss in using spaces for
indentation in a theoretical way. There are also practical problems. In
practice, many languages allow string literals like this: myName="i love
you", and strings can easily contain a run of spaces. One cannot simply
run a blind find-and-replace operation to replace all spaces with tabs.
Also, many unix languages have a construct called “heredoc” as a means
to embed a literal block of text. For example,
here's a PHP heredoc construct:

$novelText = <<<arbitraryCharsHereAsDelimiter
         (__)
         (oo)
  /-------\/
 / |     ||
*  ||----||
   ~~    ~~
arbitraryCharsHereAsDelimiter;

Regardless of its design as a language construct, the purpose of
“heredoc” is that it allows programmers to easily embed a text (a
large string), without worrying about the text containing sequences of
characters that may be meaningful to the language. If a language has a
heredoc construct, then it is basically impossible to convert from
spaces to tabs, as that will botch literal strings embedded in heredocs.
However, it is less of a problem to convert tabs to spaces, because the
frequency of spaces appearing in literal strings is far higher than that
of literal tabs.
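
Here is a rough Python sketch of both failure modes. It uses a Python triple-quoted string as a stand-in for a heredoc, and the helper `leading_spaces_to_tabs` is made up for illustration; neither is from any real conversion tool.

```python
import re


def leading_spaces_to_tabs(text: str, unit: int = 4) -> str:
    """Convert only the spaces at the start of each line, `unit` spaces per tab."""
    def repl(match):
        levels, leftover = divmod(len(match.group(0)), unit)
        return "\t" * levels + " " * leftover
    return re.sub(r"^ +", repl, text, flags=re.MULTILINE)


source = 'if x:\n    name = "i    love you"\n'

# Blind find-and-replace also rewrites the spaces inside the string literal.
blind = source.replace("    ", "\t")
print(repr(blind))

# Restricting the conversion to leading whitespace leaves that literal alone...
print(repr(leading_spaces_to_tabs(source)))

# ...but a multi-line literal has leading spaces of its own, which do get changed.
heredoc_like = 'art = """\n        (__)\n        (oo)\n"""\n'
print(repr(leading_spaces_to_tabs(heredoc_like)))
```

A converter that wants to be safe has to parse the language well enough to know where string literals begin and end.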

Another practical issue is error recovery. Suppose one uses 4 spaces
for an indentation. Now, it is not uncommon to see lines with an irregular
number of leading spaces, such as 7 or 10, out of common sloppiness. Such
errors happen more often if spaces are used for indentation; the
essence is that tabs enforce a semantic association, so it is impossible
to make a half-indentation.
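
Such sloppy lines are at least easy to find mechanically. A minimal Python sketch, assuming a 4-space indent unit (the function name `irregular_indents` and the sample text are hypothetical):

```python
def irregular_indents(text: str, unit: int = 4):
    """Yield (line_number, n_spaces) where the leading spaces are not a multiple of unit."""
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        n_spaces = len(line) - len(line.lstrip(" "))
        if n_spaces % unit:
            yield number, n_spaces


sample = (
    "def f():\n"
    "    a = 1\n"
    "       b = 2\n"      # 7 leading spaces
    "          c = 3\n"   # 10 leading spaces
)

for number, n_spaces in irregular_indents(sample):
    print(f"line {number}: {n_spaces} leading spaces")
```

Finding them is the easy part; deciding which indent level each one was meant to be still takes a human.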

Q: Well, i just like spaces because they are most compatible.

A: Sure, crass simplicity is always more compatible. Suppose a unixer
says he doesn't like HTML because it is fraught with problems and
incompatibilities. He'd prefer plain text. And, indeed, a lot of
unixers seriously think that.

---------------------------
PS in the answer to the first question, i gave the following examples
of IDE/Language that actually embed formatting info in the source code:
Borland Delphi, Metrowerks's CodeWarrior, Microsoft Visual Studio,
Wolfram Research's Mathematica

actually, i know Mathematica does, but i'm not quite sure about the
other examples. So, my question is, does anyone know of a language or
IDE that actually allows the coder to manually highlight parts of the
code and have this highlight stick with the file upon reopening, as in a
word processor?

Xah
x...@xahlee.org
http://xahlee.org/

Xah Lee wrote:
> Tabs versus Spaces in Source Code

> This post is archived at:
> http://xahlee.org/UnixResource_dir/writ/tabs_vs_spaces.html

Mumia W.

unread,
May 23, 2006, 9:19:56 AM5/23/06
to
Xah Lee wrote:
> the following are 2 FAQ following this thread. Thanks.
>
> Addendum: 2006-05-15
>
> Q: What you mean by embeding tab position info into the source code?
> How's that gonna be done?
>
> A: Tech geekers may not realize, but such embedding of meta info do
> exist in many technologies by various means because of a need. For
> example, Mac OS Classic's resource fork and Mac OS X's bundling system,
> unix shell script's shebang (#!), emacs and Python's encoding
> declaration “#-*- coding: utf-8 -*-”, Unicode's BOM, CVS's
> change-log insertion, Mathematica's source code system the Notebook,
> Microsoft Word's transparent meta data, as well as HTML and XML's
> various declarations embedded in the file. Some of these systems are
> good designs and some are hacks.
>

Vim's mode-lines do this too.

> Somehow tech geekers have the sense that “source code” must be a
> plain text file containing nothing else but the programing code. This
> may be a defendable position, but as we can see in the above examples,
> this idea is primitive and does not address the various needs. If the
> tech geekers have thought out about these issues, computing languages
> and its source code may have developed into more powerful and flexible
> integrated systems as the above standardized examples.

The tech geekers have thought about it. Donald Knuth invented TeX, and
went on to invent the WEB literate programming system. You don't get any
geekier than that :)

> For instance,
> many commercial development systems actually already have such
> meta-data embodied with the source code. (e.g. Borland Delphi,
> Metrowerks's CodeWarrior, Microsoft Visual Studio, Wolfram Research's
> Mathematica.) Some of which, not only embody development-related info
> such as debug points or linking files, but also allow programers to
> high-light code for visual purposes like a word processor, or even
> display them visually as type-set mathematics.
>
> Q: Converting spaces to tabs is actually easy. I don't see how spacess
> lose info.
>
> A: Here is a illustration on how it is not possible to convert spaces
> to tabs. Suppose you are writing in a language where the indentation is
> part of the semantics, not just for appearance. Now, suppose you have
> these two lines:

I'd say that such a language removes the choice of whether to use tabs
or spaces, and the discussion is over when you don't have a choice.

>
> 1234567890
>   A
>     B
>
> The first line has 2 space prefix and second line has 4 space prefix.
> How, if you convert this to tabs, how do you know that's 1 and 2 tabs,
> or 2 and 4 tabs? In essence, there is no way to tell how many tabs n
> represents, where n is the smallest space prefix in the code, unless n
> == 1.

vim: tabstop=4

The argument for spaces over tabs says that you have to include some
metadata in order for the document to look right on other people's
computers if you use tabs. This example, plus my example mode-line for
vim, reinforces that idea IMO.

>
> The above demonstrates the information loss in using spaces for
> indentation in a theoretical way. There are also practical problems. In
> practice, many languages allow string literals like this myName="i love
> you", and strings easily can have a run of spaces. One cannot simply
> run a blind find-n-replace operation to replace all spaces to tabs. But
> also, many unix languages contains a so-called construct of
> “heredoc” as a mean to embed a literal block of text. For example,
> here's a PHP construct of heredoc:
>
> $novelText = <<<arbitraryCharsHereAsDelimiter
>          (__)
>          (oo)
>   /-------\/
>  / |     ||
> *  ||----||
>    ~~    ~~
> arbitraryCharsHereAsDelimiter;
>

Yes, there are lots of situations like this where you can't just
willy-nilly convert between tabs and spaces. But even this case shows
that, if you use consistent tab widths, the text has a chance of
surviving. I converted your little doggie to and from text with tab
sizes of eight, and he survived. (I did it with tabs set to four too,
and it worked.)


> Regardless of its design as a language construct, the purpose of
> “heredoc” is that it allows programers to easily embed a text (a
> large string), without worrying about the text containing sequence of
> characters that may be meaningful to the language. If a language has
> heredoc construct, then it is basically impossible to convert from
> spaces to tabs, as that will botch literal string embedded in heredoc.

Yes it would. Upon printing, if the terminal tab width was set to eight,
but the text conversion was done with tabs at four, bye bye doggie.

> However, it is less of a problem to convert tabs to spaces, because the
> frequency of spaces appearing in literal strings are far higher than
> literal tabs.
>
> Another practical issue is error recovery. Suppose, one uses 4 spaces
> for a indentation. Now, it is not uncommon to see lines with odd number
> of space prefixes such as 7 or 10 out of common sloppiness. Such error
> would happen more often if spaces are used for indentation, and the
> essence is that tabs enforce a semantic association and is impossible
> to make a half-indentation.
>

What I've learned is that, if I'm going to use tabs for indentation, I
have to be consistent.

> Q: Well, i just like spaces because they are most compatible.
>
> A: Sure, crass simplicity is always more compatible. Suppose a unixer
> will say, he doesn't like HTML because it is fret with problems and
> incompatibilities. He'd rather prefer plain text. And, indeed, a lot
> unixers seriously think that.
>
> ---------------------------
> PS in the answer to the first question, i gave the following examples
> of IDE/Language that actually embed formatting info in the source code:
> Borland Delphi, Metrowerks's CodeWarrior, Microsoft Visual Studio,
> Wolfram Research's Mathematica
>

Perl's POD and Java's javadoc do it too.

> actually, i know Mathematica does, but i'm not quite sure about the
> other examples. So, my question is, does any one knows a language or
> IDE that actually allows the coder to manually highlight parts of the
> code and this highlight stick with the file upon reopening, as if a
> word processor?
>
> Xah
> x...@xahlee.org
> ∑ http://xahlee.org/
>
> Xah Lee wrote:
>> Tabs versus Spaces in Source Code
>> This post is archived at:
>> http://xahlee.org/UnixResource_dir/writ/tabs_vs_spaces.html
>

I'm slowly moving into the "spaces" camp. After reading your earlier
post on tabs vs. spaces and other people's responses, I began thinking
about why I like tabs so much, and there is only one answer--backspace.

If I use tabs, when I backspace I go back to the previous tab position,
which is what I want. With spaces, I have to hit the backspace key
several times to get back. That's it--one feature is the only reason I
like tabs, so I decided to investigate vim's features to see if vim
would let me backspace to the previous tab position with one keystroke.

'Softtabstop' (sts) is the feature. I would have never thought to look
for this feature without your post. Thanks again Xah.

Your posts are on topic, informative, engaging and necessary. Keep them
coming Xah. :)

Oliver Wong

unread,
May 23, 2006, 11:14:03 AM5/23/06
to

"Jonathon McKitrick" <j_mck...@bigfoot.com> wrote in message
news:1147963328....@j33g2000cwa.googlegroups.com...

What OS are you using? In Windows XP, you'd have to let XP know that
you're interested in input in languages other than English via "Control
Panel -> Regional Settings -> Languages -> Text Services and Input
Languages". There, you'd add input methods other than English. Each "input
method" works in a sort of unique way, so you'll just have to learn them.
For example, under English, you can use the "keyboard" input method which
probably is what you're using now, or the "handwriting recognition" input
method, or the "speech recognition" input method to insert English text.
There are other input methods for the Asian languages (e.g. Chinese,
Japanese, etc.)

- Oliver

Ben Rudiak-Gould

unread,
May 23, 2006, 1:35:53 PM5/23/06
to
Mumia W. wrote:
> Xah Lee wrote:
>>          (__)
>>          (oo)
>>   /-------\/
>>  / |     ||
>> *  ||----||
>>    ~~    ~~
>
> I converted your little doggie to and from text with tab
> sizes of eight, and he survived. (I did it with tabs set to four too,
> and it worked.)

It's a cow, actually. But I only know that because of cultural metadata.

(Or did you mean "little dogie"?)

-- Ben
