I sometimes use \scantokens, and it is a very useful primitive, but I
always have problems with the end of file. I often use the following:
\def\@def@temp#1\@nil{%
\def\@temp{#1}}
\everyeof{\@nil}
\expandafter\@def@temp\scantokens{foobar}%
or (read at the end of tex82.bug) :
\everyeof{\noexpand}
\edef\@temp{\scantokens{foobar}}
But it always ends up with
> \@temp=macro:
->foobar .
Experimentally, I tried the following :
\expandafter\def\expandafter\@gobble@sp\space{}
\everyeof{\expandafter\@gobble@sp\noexpand}
\edef\@temp{\scantokens{foobar}}
\everyeof{\expandafter\@gobble\noexpand}
\edef\@temp{\scantokens{foobar}}
But I get errors ("Use of \@gobble@sp doesn't match its definition",
resp. "Argument of \@gobble has an extra }") and I'm afraid I really don't
understand at all what is happening at the end of a file.
Any idea (or precise reference in the TeXbook)?
Manuel.
I think I finally got it, sort of. The space comes from the end of line at
the end of the file. Just say \endlinechar\m@ne beforehand, and everything's
OK. But there are some points I still don't understand...
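A minimal sketch of that fix, assuming an e-TeX-enabled plain format. The group and the \xdef are my own additions, there only to keep the \endlinechar and \everyeof settings local:

```latex
\catcode`\@=11          % allow @ in names, as elsewhere in this thread
\begingroup
  \endlinechar=-1       % append nothing to the single pseudo-file line
  \everyeof{\noexpand}  % keep the end of file from upsetting \edef
  \xdef\@temp{\scantokens{foobar}}% global, so it survives \endgroup
\endgroup
\show\@temp             % should report ->foobar with no space before the period
```

Note that \endlinechar has to be -1 at the moment \scantokens reads its pseudo-file, not merely when \@temp is used later.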
mpg wrote:
> \everyeof{\noexpand}
> \edef\@temp{\scantokens{foobar}}
>
How does this \noexpand trick work exactly ?
> Experimentally, I tried the following :
>
> \expandafter\def\expandafter\@gobble@sp\space{}
> \everyeof{\expandafter\@gobble@sp\noexpand}
> \edef\@temp{\scantokens{foobar}}
>
> \everyeof{\expandafter\@gobble\noexpand}
> \edef\@temp{\scantokens{foobar}}
>
I also just tried
\expandafter\def\expandafter\@gobble@sp\space{}
\everyeof{\@gobble@sp}
\edef\@temp{\scantokens{foobar}}
But it doesn't work either ("File ended while scanning use of
\@gobble@sp."). I'm still stuck : where exactly is the endlinechar
inserted ? Before or after the end of file ?
Manuel.
First, as you know already, \scantokens acts very much
like a cycle of \write and \input.
When TeX reaches the end of an input file, it inserts an
"outer" token, whose expansion generates the error:
! File ended while scanning definition of...
Ending the input file with \noexpand (or using eTeX's
\everyeof) makes this token act like \relax, and so
do nothing.
Donald Arseneau as...@triumf.ca
The fact that the \everyeof{\noexpand} neutralizes the EOF
proves that EOF is inserted immediately after \everyeof tokens,
though it is unclear exactly where those are inserted. My own
expectation would be that it is after the whole line (including
its \endlinechar) is tokenized. Thus, your code would produce
the equivalent of
foobar<end of line>
\@gobble@sp<EOF token>
Perhaps you can use
\everyeof{\noexpand}
\edef\@temp{\scantokens{foobar\empty}}
The end of line character is ignored after tokenizing \empty
(provided the catcode is 5) and the expansion of \empty is,
of course, empty.
Donald A. already explained how the \everyeof{\noexpand}
works. It is (in effect) the same as the \noexpand trick
for putting an outer token inside an \edef:
\outer\def\x{foo}
\edef\y{\noexpand\x bar}.
Dan
> Donald A. already explained how the \everyeof{\noexpand}
> works. It is (in effect) the same as the \noexpand trick
> for putting an outer token inside an \edef:
> \outer\def\x{foo}
> \edef\y{\noexpand\x bar}.
Sorry for asking stupid questions, but
a) Which token is used as EOF-token?
b) If that token does nothing but act like \relax due to \noexpand,
why does it not wind up unexpanded inside the edeffed
definition like any other non-expanded outer-token?
\outer\def\funnyouterthingie{foo}
\edef\testA{\noexpand\funnyouterthingie}
\show\testA
\edef\testB{\scantokens{\noexpand}}
\show\testB
\bye
Ulrich
> \everyeof{\noexpand}
> \edef\@temp{\scantokens{foobar}}
>
> But it always ends up with
>
> > \@temp=macro: ->foobar .
The additional space is due to \endlinechar getting inserted
after every line of input while you have delivered one line of
input via \scantokens.
To make confusion complete: You can deliver more lines of
input and thus obtain more \endlinechar-characters via
\scantokens by "feeding" \newlinechar to \scantokens...
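A small sketch of that effect, assuming plain e-TeX, where ^^J has category code 12 and \endlinechar has its default value. The choice of ^^J as \newlinechar is my own, for illustration:

```latex
\catcode`\@=11
\begingroup
  \newlinechar=`\^^J    % ^^J now separates lines inside \scantokens
  \everyeof{\noexpand}
  \xdef\@temp{\scantokens{foo^^Jbar}}% two pseudo-file lines: foo and bar
\endgroup
\show\@temp             % each line gets its own \endlinechar
```

Under these assumptions the definition should contain a space after foo as well as after bar, one per pseudo-file line.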
By the way: Heiko Oberdiek showed a nice trick on how
to define a macro from the contents of an input-file
without the need of total expansion.
Sincerely
Ulrich
% ---------------------------------------------------------------
%
% Macro \inputfiledef for unexpanded importing of file-content
% as macro-definition.
%
% Credits:
%
% The idea/trick/gist on which this macro is based, was presented
% by Heiko Oberdiek in Message-ID: <f1f538$nrl$1...@news.BelWue.DE>
% http://groups.google.de/group/comp.text.tex/msg/ed31601012d7b1c2?dmode=source
%
% ---------------------------------------------------------------
% write file barfoo.tex, needed for \input-example below:
%
\begingroup
\immediate\openout 1 barfoo.tex
\immediate\write 1 {barfoo}
\immediate\closeout 1
\endgroup
% ---------------------------------------------------------------
% define the macro \inputfiledef:
%
\catcode`\@=11
\def\@inputfiledef#1#2\ENDMARKER{%
\endgroup
\def#1{#2}% Or \edef or whatever def-fing variant you prefer.
\noexpand
}%
\def\inputfiledef#1#2{%
\begingroup
\everyeof{\ENDMARKER}%
\endlinechar=-1
\expandafter\@inputfiledef\expandafter#1#2%
}%
% ---------------------------------------------------------------
% restrictions/known issues related to the macro \inputfiledef:
%
% - catcode-1- and catcode-2-characters within files must be
% balanced.
% - The token \ENDMARKER must not occur within the files
%
% ---------------------------------------------------------------
% usage-example of the macro \inputfiledef:
%
\inputfiledef\@tempa{\scantokens{foobar}}%
\show\@tempa
\inputfiledef\@tempb{\input barfoo.tex}%
\show\@tempb
\bye
> To make confusion complete: You can deliver more lines of
> input and thus obtain more \endlinechar-characters via
> \scantokens by "feeding" \newlinechar to \scantokens...
>
Funnier, indeed...
> By the way: Heiko Oberdiek showed a nice trick on how
> to define a macro from the contents of an input-file
> without the need of total expansion.
>
Thanks for the hint.
Manuel.
> Sorry for asking stupid questions, but
> a) Which token is used as EOF-token?
> b) If that token does nothing but act like \relax due to \noexpand,
> why does it not wind up unexpanded inside the edeffed
> definition like any other non-expanded outer-token?
>
Same question here.
Precisely, is EOF a token, or merely some state of TeX's reading process
(like the three states N, M, S described at the end of chapter 8 of the
TeXbook)?
Manuel.
> Donald A. already explained how the \everyeof{\noexpand}
> works. It is (in effect) the same as the \noexpand trick
> for putting an outer token inside an \edef:
> \outer\def\x{foo}
> \edef\y{\noexpand\x bar}.
>
By the way, I didn't know this trick. Thanks.
Manuel.
> % Macro \inputfiledef for unexpanded importing of file-content
> % as macro-definition.
Also see package `catchfile'.
Yours sincerely
Heiko <ober...@uni-freiburg.de>
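For reference, a short sketch of how catchfile might be used here, in LaTeX. It relies on the package's \CatchFileDef{<macro>}{<file>}{<setup>} command. The names \filecontent and \demoout are hypothetical, and barfoo.tex is the example file from earlier in the thread:

```latex
\documentclass{article}
\usepackage{catchfile}
\newwrite\demoout                      % hypothetical output stream
\immediate\openout\demoout=barfoo.tex  % write the example file first
\immediate\write\demoout{barfoo}
\immediate\closeout\demoout
% Read the file into \filecontent without expanding it; the setup
% argument suppresses the end-of-line character, as discussed above.
\CatchFileDef{\filecontent}{barfoo.tex}{\endlinechar=-1 }
\begin{document}
\texttt{\meaning\filecontent}
\end{document}
```

The package also provides \CatchFileEdef for the fully expanded variant.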
I assume it is simply some pointer in TeX the program and never
makes it to the user's level (something like the token
TeX calls \inaccessible when it reports an error after
\def{...}).
Knuth calls both _special_ tokens. Both are there so that TeX
can proceed with its token processing where otherwise it
would have to abort. They are not supposed to be accessible.
One could ask the philosophical question: what is a token?
Essentially it is just a data structure that the program uses
to decide the next course of action. Presumably the EOF
token has no real data: it is just used once for the error
message and discarded. The \inaccessible token could be
the same (just used to enable the gobbling of the definition
text) or it could actually be defined (but be inaccessible to
the user).
> > b) If that token does nothing but act like \relax due to \noexpand,
> > why does it not wind up unexpanded inside the edeffed
> > definition like any other non-expanded outer-token?
Even a normal outer token merely *acts* like \relax after \noexpand
but isn't exactly:
\outer\def\x{x}
\edef\y{\expandafter\ifx\noexpand\x\relax Relaxed.\else Not
relaxed.\fi}
\show\y
produces
Not relaxed.
I like to think of tokens as having various bits set (=1) or not set
(=0).
I imagine \noexpand turning off the expansion bit (to which the outer
character is linked). As there are no other actions associated with a
macro token, it behaves like \relax temporarily, but keeps its own
name.
Because it is a special token, EOF could simply have no name,
to prevent it from being accessed by a user. \noexpand-ing it
could then simply leave nothing.
>
> Precisely, is EOF a token, or merely some state of TeX's reading process
> (like the three states N, M, S described at the end of the chapter 8 of the
> TeXbook) ?
Knuth calls it a token, so I presume it is coded like one.
Still, Knuth uses metaphor a lot in describing TeX. One
would have to check out the actual code (the book
"TeX, the Program", or the source, tex.web) to see for sure.
As the concept of outerness is associated with tokens, it
would make sense to share that code.
Dan
Not stupid at all!
a) Good question! Looking in tex.web there is no token declared in
section 222, where others like \endtemplate and \inaccessible are
defined, nor in section 780 where others are given names. Moreover,
section 362 does not insert any token at the end of the input file,
but instead executes "check_outer_validity", which is itself the test
"if scanner_status<>normal".
On the experimental side, if you test with \everyeof{\let\next=}, then
you will find that \next becomes the first token following the \input
file.
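A sketch of that experiment with \scantokens in place of \input, assuming an e-TeX-enabled plain format. The \global, the group and the discarded \hbox are my additions, so that nothing is typeset and \next survives:

```latex
\begingroup
  \everyeof{\global\let\next=}% \let must fetch one more token at end of file
  \setbox0=\hbox{\scantokens{foobar}X}% X is the first token after the pseudo-file
\endgroup
\show\next
```

Here \show should report that \next is the letter X, which suggests that nothing was inserted between the \everyeof tokens and the text following the pseudo-file.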
Therefore, my current understanding is that TeX does not insert any
token at the EOF. Does Knuth explicitly say it does? All I found in
The TeXbook is "The end of an input file is also considered to be
\outer in this sense".
So how does \noexpand help? In sec 369 (of tex.web) \noexpand is
defined, and the statement there is: "Since \outer macros might arise
here, we must also clear the scanner_status temporarily". That
explains it! Using \noexpand causes TeX to think it is in normal
typesetting mode temporarily, while it gets the next token. If that
next token is beyond the end of file, then TeX doesn't complain when
the file ends.
Where else does TeX set "scanner_status:=normal"? There's section 470
for \string and \meaning, and they both work to suppress the error
message, but they have more serious side-effects than \noexpand. Then
there are sections 473 and 482 where "scan_toks" and "read_toks" read
token lists, but that use is restoring normalcy after scanning (after
the "}"). Section 507, the code for \ifx, looks promising, but that
makes two tokens disappear and requires a following \fi. The final
case deals with scanning alignment preambles, and that gets way too
messy.
Donald Arseneau as...@triumf.ca
>
> Not stupid at all!
>
> a) Good question! Looking in tex.web there is no token declared in
> section 222, where others like \endtemplate and \inaccessible are
> defined, nor in section 780 where others are given names. Moreover,
> section 362 does not insert any token at the end of the input file,
> but instead executes "check_outer_validity", which is itself the test
> "if scanner_status<>normal".
Note that, if the test is true, before signalling an error TeX executes
<@Backup an outer control sequence so that it can be reread@>
There is a comment that says that EOF is characterized by the condition
cur_cs=0, and there is no token that will be marked as to be read again.
Dan asks
> One could ask the philosophical question: what is a token?
The current token is a combination of cur_cs, cur_cmd and cur_chr that can
be packed into cur_tok (and a token list contains quantities such as
cur_tok). A token is a character token if and only if cur_cs=0 (in which
case cur_cmd is the category code). In all other cases (including active
characters), cur_cs is a pointer into the table of equivalents, from which
cur_cmd can be obtained.
When TeX reads a token, it clears all values, then sets cur_cs if a control
sequence is found, or cur_cmd and cur_chr in the case of a character (when
reading from a token list, it unpacks the token).
In some cases, TeX calls check_outer_validity if cur_cmd>=outer_call
(this can happen only if cur_cs>0, since for a character we have
cur_cmd<16). If TeX cannot read a token because of EOF, it also calls
check_outer_validity (cur_cs is still zero), so this case cannot be
confused with the preceding one.
This is called optimisation: you trade some lines of code against a
piece of dynamic memory.
Jose'