
Wanted: TeX macro to merge two lists, bit like Python's zip.

Jonathan Fine

Jun 1, 2010, 10:39:32 AM
Hello

I need a TeX macro that will merge two lists, something like Python's zip.
$ python
>>> zip('abc', 'def')
[('a', 'd'), ('b', 'e'), ('c', 'f')]

Anyone know of anything that does something like this?
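
For anyone trying this under Python 3: there zip returns a lazy iterator rather than a list, so the session above needs a list() call to show the pairs. A minimal sketch (the variable names are only illustrative):

```python
# Python 3: zip() is lazy, so wrap it in list() to materialise the pairs.
first = 'abc'
second = 'def'
pairs = list(zip(first, second))
print(pairs)  # [('a', 'd'), ('b', 'e'), ('c', 'f')]
```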

--
Jonathan

Philipp Stephani

Jun 1, 2010, 2:59:06 PM
Jonathan Fine <J.F...@open.ac.uk> writes:

Are these lists simple lists of tokens?

\def\first{abc}
\def\second{def}
\def\xmerge#1#2{%
\expandafter\expandafter\expandafter\merge
\expandafter\expandafter\expandafter{\expandafter#1\expandafter}%
\expandafter{#2}%
}
\def\merge#1#2{%
\ifx&#1&\else\ifx&#2&\else
\mergeA#1\nil#2\nil
\fi\fi
}
\def\mergeA#1#2\nil#3#4\nil{%
(#1,#3)%
\ifx&#2&\else\ifx&#4&\else
;\mergeA#2\nil#4\nil
\fi\fi
}
\merge{abc}{def}\par
\xmerge\first\second\par
\bye

--
Change “LookInSig” to “tcalveu” to answer by mail.

Jonathan Fine

Jun 2, 2010, 4:25:26 AM
Philipp Stephani wrote:
> Jonathan Fine <J.F...@open.ac.uk> writes:
>
>> Hello
>>
>> I need a TeX macro that will merge two lists, something like Python's zip.
>> $ python
>> >>> zip('abc', 'def')
>> [('a', 'd'), ('b', 'e'), ('c', 'f')]
>>
>> Anyone know of anything that does something like this?
>
> Are these list simple lists of tokens?

At present the items in each list are, for example,
\itemA{<anything>}
\itemB{<anything>}
but that can be changed.

> \def\first{abc}
> \def\second{def}
> \def\xmerge#1#2{%
> \expandafter\expandafter\expandafter\merge
> \expandafter\expandafter\expandafter{\expandafter#1\expandafter}%
> \expandafter{#2}%
> }
> \def\merge#1#2{%
> \ifx&#1&\else\ifx&#2&\else
> \mergeA#1\nil#2\nil
> \fi\fi
> }
> \def\mergeA#1#2\nil#3#4\nil{%
> (#1,#3)%
> \ifx&#2&\else\ifx&#4&\else
> ;\mergeA#2\nil#4\nil
> \fi\fi
> }
> \merge{abc}{def}\par
> \xmerge\first\second\par
> \bye

Thank you, Philipp, for this. I don't know if it will work in my
situation. (At present it 'typesets' the merge of the two lists rather
than 'returns' the merged list.) I'm also a bit bothered by the
quadratic running time.

--
Jonathan

David Kastrup

Jun 2, 2010, 5:12:59 AM
Jonathan Fine <J.F...@open.ac.uk> writes:

> Philipp Stephani wrote:
>> Jonathan Fine <J.F...@open.ac.uk> writes:
>>
>>> Hello
>>>
>>> I need a TeX macro that will merge two lists, something like Python's zip.
>>> $ python
>>> >>> zip('abc', 'def')
>>> [('a', 'd'), ('b', 'e'), ('c', 'f')]
>>>
>>> Anyone know of anything that does something like this?
>>
>> Are these list simple lists of tokens?
>
> At present the items in each list are, for example,
> \itemA{<anything>}
> \itemB{<anything>}
> but that can be changed.

[...]

> Thank you, Philipp, for this. I don't know if it will work in my
> situation. (At present it 'typesets' the merge of the two lists
> rather than 'returns' the merged list.) I'm also a bit bothered by
> the quadratic running time.

TeX can only process adjacent tokens in constant time, so quadratic
running time is the best you can hope for. There is a large bag of
tricks to work with non-adjacent tokens that provide for considerable
leeway in the constant factor you have to live with.

I have not actually checked the presented code, but I would not be
surprised if its behavior was actually worse than O(n^2): it usually
requires a lot of effort to even reach O(n^2) with algorithms
implemented in TeX. This is often masked by TeX's efficiency that keeps
the constant factor small enough to escape notice for a long time.
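
A rough Python analogy of why head-and-tail recursion tends to be quadratic. This is a sketch only: it counts copied list items as a stand-in for the tokens TeX has to rescan each time a macro grabs "the rest of the list" as a delimited argument, and it is not a measurement of any TeX code in this thread. The function names are made up for illustration.

```python
def zip_by_peeling(first, second):
    """Zip by repeatedly splitting off the head; the tail is copied each time."""
    pairs = []
    copied = 0  # total number of items copied while forming the tails
    while first and second:
        pairs.append((first[0], second[0]))
        first, second = first[1:], second[1:]  # each slice copies the rest
        copied += len(first) + len(second)
    return pairs, copied


def zip_by_index(first, second):
    """Zip by direct subscripting; each item is touched exactly once."""
    n = min(len(first), len(second))
    return [(first[i], second[i]) for i in range(n)]


pairs, copied = zip_by_peeling(list("abcd"), list("efgh"))
print(pairs, copied)
```

For two lists of equal length n the peeling version copies n(n-1) items in total (12 for n = 4), while the indexed version does work proportional to n.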

--
David Kastrup
UKTUG FAQ: <URL:http://www.tex.ac.uk/cgi-bin/texfaq2html>

Jonathan Fine

Jun 2, 2010, 5:59:52 AM
David Kastrup wrote:
> Jonathan Fine <J.F...@open.ac.uk> writes:

>>>> I need a TeX macro that will merge two lists, something like Python's zip.
>>>> $ python
>>>> >>> zip('abc', 'def')
>>>> [('a', 'd'), ('b', 'e'), ('c', 'f')]

> TeX can only process adjacent tokens in constant time, so quadratic
> running time is the best you can hope for.

Can LuaTeX do any better?

--
Jonathan

David Kastrup

Jun 2, 2010, 7:13:23 AM
Jonathan Fine <J.F...@open.ac.uk> writes:

I should think so. However, my original statement is not entirely
accurate since you can, of course, just assign tokens one by one to
successive token registers and recall them in arbitrary order.

However, in particular without using eTeX, token registers are likely to
run out faster than you will notice an advantage of this O(n)
implementation over good O(n^2) ones.

Doing the equivalent of array access using \csname
...\number...\endcsname instead looks like O(n) when not looking too
closely, but you'll get a large number of hash collisions, and the
resulting slowdown impacts TeX even after your algorithm has finished.

In contrast, Lua arrays are rather efficient, and they don't have
lasting performance implications once you delete them again.

luigi scarso

Jun 2, 2010, 7:54:22 AM

\directlua {
function ziptwostrings(a,b)
tex.sprint('[')
for i=1,math.max(\string#a,\string#b) do
tex.sprint("('",string.sub(a,i,i),"','",string.sub(b,i,i),"')")
end
tex.sprint(']')
end
}


\def\zip#1#2{\directlua{ziptwostrings('#1','#2')}}

\zip{abc}{def}\par
\zip{ab}{cdef}\par
\zip{abcd}{ef}\par


Disclaimer:
I'm sure that I copied it from somewhere some time ago, so it's not mine
(maybe Hans?)

--
luigi

Jonathan Fine

Jun 2, 2010, 8:30:42 AM

Thank you for this, Luigi. I want a function that will merge two lists,
and you've given me a function that will merge two strings. But it
won't merge two lists. (Partly my fault, because I gave the simplest
possible example of what Python's zip does.)

By the way, for me your code does not work if one of the strings
contains a quote mark:

This is LuaTeX, Version beta-0.40.5-2009101820 (Web2C 2009)

*\message{\zip{a}{'}}
! LuaTeX error <\directlua >:1: unfinished string near '<eof>'.
\zip #1#2->\directlua {ziptwostrings('#1','#2')}

<*> \message{\zip{a}{'}
}
?

--
Jonathan

Ulrich D i e z

Jun 2, 2010, 12:41:39 PM
Jonathan Fine wrote:

Probably \MergeUndelimitedArgumentLists in the example below
does almost what you need.

"almost" - instead of commas and parentheses, braces will be used:

\MergeUndelimitedArgumentLists{abc}{def} yields
{{a}{d}}%
{{b}{e}}%
{{c}{f}}

Google-group-users/web-interface-users please make sure that the
posting is shown in "original format".


Sincerely

Ulrich

\errorcontextlines=10000
%%
%% This (not so) minimal example was written in June 2, 2010
%% by Ulrich Diez (eu_an...@web.de)
%%
%% There is no warranty - neither for probably included
%% documentation nor for any other part/component of this work/
%% of this (not so) minimal example.
%% If something breaks, you usually may keep the pieces.
%%
\documentclass{article}
\makeatletter
%%---------------------------------------------------------------
%% Stopper for \romannumeral-Expansion:
%%...............................................................
\newcommand\@rmstop{0 }
%%---------------------------------------------------------------
%% Check if argument is null/blank:
%%...............................................................
%% The \@ifnull-macro is derived from
%% Robert R Schneck's \ifempty-macro
%% [ news:3eef1...@corp.newsgroups.com
%% Newsgroups: comp.text.tex,
%% Subject:
%% Re: \ifempty solution (was Macro puzzle: maximally general \ifempty)
%% Message-ID: <3eef1...@corp.newsgroups.com>
%% Date: 17 Jun 2003 08:42:50 -0500 ].
\newcommand\@ifnull[1]{%
\romannumeral\expandafter\@firstofone\expandafter{\expandafter
\expandafter\expandafter\@rmstop\csname @\expandafter\@gobble\string{%
\expandafter\@secondoftwo\expandafter{\expandafter{\string#1}%
{}\expandafter\expandafter\expandafter\@gobble\expandafter
\@gobble\string}{}\expandafter\expandafter\expandafter
\@firstoftwo\expandafter\expandafter\expandafter{%
\expandafter\@gobble\string}second}{first}oftwo\endcsname}%
}
\newcommand\@ifblank[1]{%
\romannumeral\expandafter\expandafter\expandafter\@gobble
\expandafter\@ifnull\expandafter{\@gobble#1.}%
}
%%---------------------------------------------------------------
%% \extractfirstlistarg{<action>}%
%% {<action if list empty/blank>}%
%% {{<e_k>}{<e_(k+1)>}..{<e_n>}}%
%%
%% -> either: <action if list empty>
%% or: <action>{<e_k>}{{<e_(k+1)>}..{<e_n>}}%
%%...............................................................
\newcommand\PassFirstToSecond[2]{#2{#1}}
\@ifdefinable\@carbrace{%
\long\def\@carbrace#1#2\@nil{{#1}}%
}
\newcommand\extractfirstlistarg[3]{%
\csname @firstofone%
\@ifblank{#3}%
{\endcsname{#2}}%
{%
\expandafter\expandafter
\expandafter \@extractfirstlistargloop
\expandafter\PassFirstToSecond
\expandafter{%
\@gobble#3}{{#3\@nil}}{#1}%
}%
}
\newcommand\@extractfirstlistargloop[3]{%
\expandafter\@ifnull\expandafter{\@gobble#1}%
{\endcsname{#3}#1{#2}}%
{%
\expandafter\@extractfirstlistargloop
\expandafter{%
\@carbrace#1}{#2}{#3}%
}%
}
%%---------------------------------------------------------------
%% \MergeUndelimitedArgumentLists{{<e_1>}{<e_2>}..{<e_m>}}%
%% {{<f_1>}{<f_2>}..{<f_n>}}
%%
%% -> {{<e_1>}{<f_1>}}{{<e_2>}{<f_2>}} .. {{<e_n>}{<f_n>}}
%%
%% \MergeUndelimitedArgumentLists merges two lists of undelimited
%% arguments into a list of "2-argument-tuples".
%%
%% In case lists are not of equal length, one component of
%% the argument-tuple will either come from the top-level-expansion
%% of \FirstListDefault or from the top-level-expansion of
%% \SecondListDefault for those elements of one list which don't
%% have a counterpart in the other list.
%%
%% \MergeUndelimitedArgumentLists is expandable/can be used
%% in \edef and - due to \romannumeral-expansion - delivers
%% the result after two expansion-steps.
%%
%% Iteration terminates depending on emptiness/blankness of
%% arguments, not on the definition of some termination-marker
%% and thus cannot be terminated erroneously by some token
%% whose meaning would equal that of a list-termination-marker.
%% There is only one internal macro (\@carbrace) where delimited
%% arguments are used, thus you can safely use any non-outer-token
%% inside the two list-arguments.
%%
%% Be aware that space-tokens will be discarded if they separate
%% undelimited macro-arguments.
%%---------------------------------------------------------------
\newcommand\MergeUndelimitedArgumentLists[2]{%
\romannumeral\@MergeUndelimitedArgumentLists{#1}{#2}{\@rmstop}%
}%
\newcommand\@@MergeUndelimitedArgumentLists[5]{%
\@MergeUndelimitedArgumentLists{#4}{#2}{#5{{#3}{#1}}}%
}%
\newcommand\@@@MergeUndelimitedArgumentLists[3]{%
\@@MergeUndelimitedArgumentLists{#2}{#3}{#1}{}%
}%
\newcommand\@MergeUndelimitedArgumentLists[2]{%
\extractfirstlistarg{%
\extractfirstlistarg{%
\@@MergeUndelimitedArgumentLists
}{%
\expandafter\@@MergeUndelimitedArgumentLists
\expandafter{\SecondListDefault}{}%
}{#2}%
}{%
\extractfirstlistarg{%
\expandafter\@@@MergeUndelimitedArgumentLists
\expandafter{\FirstListDefault}%
}{\@firstofone}{#2}%
}{#1}%
}%
\newcommand*\FirstListDefault{?}
\newcommand*\SecondListDefault{!}
\makeatother


\begin{document}

\ttfamily\selectfont\frenchspacing
\parindent\csname z@\endcsname
\parskip .66\baselineskip


\verb|\MergeUndelimitedArgumentLists{ABCDE}{12345}|:\newline
\expandafter\expandafter\expandafter\def
\expandafter\expandafter\expandafter\Merged
\expandafter\expandafter\expandafter{%
\MergeUndelimitedArgumentLists{ABCDE}{12345}%
}
\meaning\Merged
\newline
text:->\MergeUndelimitedArgumentLists{ABCDE}{12345}


\verb|\MergeUndelimitedArgumentLists{ABC}{12345}|:\newline
\expandafter\expandafter\expandafter\def
\expandafter\expandafter\expandafter\Merged
\expandafter\expandafter\expandafter{%
\MergeUndelimitedArgumentLists{ABC}{12345}%
}
\meaning\Merged
\newline
text:->\MergeUndelimitedArgumentLists{ABC}{12345}


\verb|\MergeUndelimitedArgumentLists{ABCDE}{123}|:\newline
\expandafter\expandafter\expandafter\def
\expandafter\expandafter\expandafter\Merged
\expandafter\expandafter\expandafter{%
\MergeUndelimitedArgumentLists{ABCDE}{123}%
}
\meaning\Merged
\newline
text:->\MergeUndelimitedArgumentLists{ABCDE}{123}

\end{document}

luigi scarso

Jun 2, 2010, 3:54:38 PM
I believe that Python's lists can be emulated by Lua's tables,
but it can be a bit complicated to implement the immutability property.

>
> By the way, for me your code does not work if one of the strings
> contains a quote mark:
Hm, we should use Lua's [[..]] strings then.

>
> This is LuaTeX, Version beta-0.40.5-2009101820 (Web2C 2009)
>
> *\message{\zip{a}{'}}
> ! LuaTeX error <\directlua >:1: unfinished string near '<eof>'.
> \zip #1#2->\directlua {ziptwostrings('#1','#2')}
>
> <*> \message{\zip{a}{'}
>                         }
> ?
>
> --
> Jonathan

This is by H. Hagen; it's ConTeXt MkIV:

\directlua {
function ziptwowhatever(a,b)
tex.sprint('[')
for i=1,math.max(\string#a,\string#b) do
tex.sprint(
"('",
(type(a) == "table" and (a[i] or "")) or string.sub(a,i,i),
"','",
(type(b) == "table" and (b[i] or "")) or string.sub(b,i,i),
"')"
)
end
tex.sprint(']')
end
}

\def\zip#1#2{\directlua{ziptwowhatever([[#1]],[[#2]])}}

\starttext

\zip{abc}{def}\par
\zip{ab}{cdef}\par
\zip{abcd}{ef}\par

\directlua {
ziptwowhatever({"a1","a2","a3"},{"b1","b2","b3","b4"})
}

\stoptext

--
luigi

Heiko Oberdiek

Jun 2, 2010, 6:32:46 PM
Jonathan Fine <J.F...@open.ac.uk> wrote:

> luigi scarso wrote:

> > \def\zip#1#2{\directlua{ziptwostrings('#1','#2')}}

> By the way, for me your code does not work if one of the strings
> contains a quote mark:
>
> This is LuaTeX, Version beta-0.40.5-2009101820 (Web2C 2009)
>
> *\message{\zip{a}{'}}
> ! LuaTeX error <\directlua >:1: unfinished string near '<eof>'.
> \zip #1#2->\directlua {ziptwostrings('#1','#2')}

LuaTeX provides \luaescapestring for this reason:

| 2.6.3 \luaescapestring
| This primitive converts a TeX token sequence so that it
| can be safely used as the contents of a Lua string:
| embedded backslashes, double and single quotes,
| and newlines and carriage returns are escaped. [...]

--
Heiko Oberdiek

Jonathan Fine

Jun 3, 2010, 11:40:00 AM
Ulrich D i e z wrote:
> Jonathan Fine wrote:
>
>> I need a TeX macro that will merge two lists, something like Python's zip.
>> $ python
>> >>> zip('abc', 'def')
>> [('a', 'd'), ('b', 'e'), ('c', 'f')]
>>
>> Anyone know of anything that does something like this?
>
> Probably \MergeUndelimitedArgumentLists in the example below
> does almost what you need.

[about 30 lines of ingenious TeX macros snipped]

Thank you for this, Ulrich. However, I won't be using it, for a number
of reasons. (Each reason by itself is not enough, but taken together
they produce my decision.)

1. By the time you posted your code, I had written my own
problem-specific solution.

2. If I used your code I'd also have to support it (as it's not a
package on CTAN yet).

3. Python's zip stops on the shorter list, which is what I want. Your
code continues.

4. In Python you can write
a = zip('abc', 'def')
I was wanting a TeX equivalent that 'makes an assignment', say with syntax
\zipset \a {abc} {def}

5. My lists have an application specific (and quite sensible there)
form. To use your code, Ulrich, I'd have to adapt it or change my own.
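
The difference in point 3 can be seen in Python itself. The sketch below contrasts the built-in zip, which stops at the shorter list, with a "continuing" variant that pads the missing side. zip_with_defaults is a made-up helper, and the '?' and '!' fill values are chosen only to mirror \FirstListDefault and \SecondListDefault from the posted macros.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel marking "no item left in this list"


def zip_with_defaults(first, second, first_default="?", second_default="!"):
    """Pair items up, padding whichever list runs out first."""
    pairs = []
    for a, b in zip_longest(first, second, fillvalue=_MISSING):
        pairs.append((first_default if a is _MISSING else a,
                      second_default if b is _MISSING else b))
    return pairs


print(list(zip("ABC", "12345")))        # stops after three pairs
print(zip_with_defaults("ABC", "12345"))  # continues, padding with '?'
```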


If there was already a published working solution that fitted in with my
needs I certainly would have wanted to use it.

If anyone is interested in exploring this topic further please do post.
It's something I've got time and energy for.

--
Jonathan

Jonathan Fine

Jun 3, 2010, 11:48:30 AM


Thank you for this, Heiko. This will certainly improve string handling.

Does anyone know if my original problem can be solved? To make it more
explicit:

\def\setzip #1#2#3{%
% Solution goes here.
}

\setzip\actual {\zero\two\four}{\one\three}
\def\result{\zero\one\two\three}
\ifx\actual\expect\else\ddt\fi

--
Jonathan

vibrovski

Jun 3, 2010, 1:53:36 PM
> Does anyone know if my original problem can be solved.  To make it more
> explicit
>
>     \def\setzip #1#2#3{%
>         % Solution goes here.
>     }
>
>     \setzip\actual {\zero\two\four}{\one\three}
>     \def\result{\zero\one\two\three}
>     \ifx\actual\expect\else\ddt\fi
>

Maybe there is a better or more efficient way of doing it, but how
about:

\documentclass{article}

\begin{document}

\makeatletter

\def\@zip@empty{}
\def\@zip@stop{\@zip@stop}

\def\ziplists#1#2#3{%
\let#1=\@zip@empty%
\@ziplists{#1}#2\@zip@stop\@zip@@stop#3\@zip@stop\@zip@@stop
\@zip@@@stop%
}

\def\@ziplists#1#2#3\@zip@@stop#4#5\@zip@@stop{%
\def\@zip@temp@a{#2}%
\def\@zip@temp@b{#4}%
\ifx\@zip@temp@a\@zip@stop%
\ifx\@zip@temp@b\@zip@stop%
\let\@zip@next=\@@ziplists%
\else%
\expandafter\def\expandafter#1\expandafter{#1#4}%
\def\@zip@next{\@ziplists{#1}\@zip@stop\@zip@@stop#5\@zip@@stop}%
\fi%
\else%
\ifx\@zip@temp@b\@zip@stop%
\expandafter\def\expandafter#1\expandafter{#1#2}%
\def\@zip@next{\@ziplists{#1}#3\@zip@@stop\@zip@stop\@zip@@stop}%
\else%
\expandafter\def\expandafter#1\expandafter{#1#2#4}%
\def\@zip@next{\@ziplists{#1}#3\@zip@@stop#5\@zip@@stop}%
\fi%
\fi%
\@zip@next}

\def\@@ziplists#1\@zip@@@stop{}

\makeatother

\ziplists\x{\zero\two\four}{\one\three}

\show\x

\end{document}

Regards

Mark

vibrovski

Jun 3, 2010, 1:57:48 PM


Actually I have a feeling that's not quite what you wanted...

Regards

Mark

Heiko Oberdiek

Jun 3, 2010, 5:35:12 PM
Jonathan Fine <J.F...@open.ac.uk> wrote:

> Does anyone know if my original problem can be solved. To make it more
> explicit
>
> \def\setzip #1#2#3{%
> % Solution goes here.
> }
>
> \setzip\actual {\zero\two\four}{\one\three}
> \def\result{\zero\one\two\three}
> \ifx\actual\expect\else\ddt\fi

The following implementation works for that kind of token.

\def\setzip#1#2#3{%
\begingroup
\toks0={}%
\def\x{#2}%
\ifx\x\empty
\else
\def\x{#3}%
\ifx\x\empty
\else
\setzipi#2\NIL#3\NIL
\fi
\fi
\edef\x{\endgroup
\def\noexpand#1{\the\toks0}%
}%
\x
}
\def\setzipi#1#2\NIL#3#4\NIL\fi\fi{%
\fi\fi
\toks0=\expandafter{\the\toks0 #1#3}%
\def\x{#2}%
\ifx\x\empty
\else
\def\x{#4}%
\ifx\x\empty
\else
\setzipi#2\NIL#4\NIL
\fi
\fi
}

\def\test#1#2#3{%
\setzip\actual{#1}{#2}%
\def\result{#3}%
\ifx\actual\result
\else
\immediate\write16{ERROR}%
\immediate\write16{result: \meaning\result}%
\immediate\write16{actual: \meaning\actual}%
\fi
}

\test{\zero\two\four}{\one\three}{\zero\one\two\three}
\test{}{}{}
\test{\zero}{}{}
\test{}{\zero}{}

\csname @@end\endcsname\end


Refinement, if the tokens can be \if/\else/\fi tokens:

\def\setzip#1#2#3{%
\begingroup
\toks0={}%
\def\x{#2}%
\def\y{#3}%
\ifx\x\empty
\else
\ifx\y\empty
\else
\expandafter\expandafter\expandafter\setzipi
\expandafter\x\expandafter\NIL\y\NIL
\fi
\fi
\edef\x{\endgroup
\def\noexpand#1{\the\toks0}%
}%
\x
}
\def\setzipi#1#2\NIL#3#4\NIL\fi\fi{%
\fi\fi
\toks0=\expandafter{\the\toks0 #1#3}%
\def\x{#2}%
\def\y{#4}%
\ifx\x\empty
\else
\ifx\y\empty
\else
\expandafter\expandafter\expandafter\setzipi
\expandafter\x\expandafter\NIL\y\NIL
\fi
\fi
}

\def\test#1#2#3{%
\setzip\actual{#1}{#2}%
\def\result{#3}%
\ifx\actual\result
\else
\immediate\write16{ERROR}%
\immediate\write16{result: \meaning\result}%
\immediate\write16{actual: \meaning\actual}%
\fi
}

\test{\zero\two\four}{\one\three}{\zero\one\two\three}
\test{}{}{}
\test{\zero}{}{}
\test{}{\zero}{}
\test{\fi\else\if}{\fi\else\if}{\fi\fi\else\else\if\if}

\csname @@end\endcsname\end

--
Heiko Oberdiek

Message has been deleted

vibrovski

Jun 4, 2010, 3:23:52 AM

To stop on the shorter list change the definition of \@ziplists:

\def\@ziplists#1#2#3\@zip@@stop#4#5\@zip@@stop{%
\def\@zip@temp@a{#2}%
\def\@zip@temp@b{#4}%
\ifx\@zip@temp@a\@zip@stop%
\let\@zip@next=\@@ziplists%
\else%
\ifx\@zip@temp@b\@zip@stop%
\let\@zip@next=\@@ziplists%
\else%
\expandafter\def\expandafter#1\expandafter{#1#2#4}%
\def\@zip@next{\@ziplists{#1}#3\@zip@@stop#5\@zip@@stop}%
\fi%
\fi%
\@zip@next}

Although unlike Heiko's solution it fails if the lists contain \if
\else\fi.

Regards

Mark

Ulrich D i e z

Jun 4, 2010, 8:18:54 AM
[ This posting supersedes my posting
news:hu9epa$mf$02$1...@news.t-online.com
as the macro \@ziplists provided therein can be shortened
without affecting functionality.
I don't know what I was thinking... ]


Jonathan Fine wrote in news:hu8icg$o9i$1...@south.jnrs.ja.net :

> 3. Python's zip stops on the shorter list, which is what I want. Your
> code continues.

Stopping on the shorter list is easier to implement
than continuing.

> 4. In Python you can write
> a = zip('abc', 'def')
> I was wanting a TeX equivalent that 'makes an assignment', say with syntax
> \zipset \a {abc} {def}

The code provided by me is expandable, thus one can
easily use it inside another macro which serves as "wrapper"
for performing the assignment.

> 5. My lists have an application specific (and quite sensible there)
> form. To use your code, Ulrich, I'd have to adapt it or change my own.

[...]


> If anyone is interested in exploring this topic further please do post.
> It's something I've got time and energy for.

In case you provide exact specifications of your lists
and of what you expect as result of merging your lists,
I might also be interested in exploring this topic further.

Jonathan Fine wrote in news:hu8isf$oe5$1...@south.jnrs.ja.net

> Does anyone know if my original problem can be solved. To make it more
> explicit
>
> \def\setzip #1#2#3{%
> % Solution goes here.
> }
>
> \setzip\actual {\zero\two\four}{\one\three}
> \def\result{\zero\one\two\three}
> \ifx\actual\expect\else\ddt\fi

I changed the macros from my previous posting to stop
at the shorter list and not to insert braces.

Also I renamed stuff to \ziplists and added the macro \setzip.

Some questions:

- In the above - where does \expect come from? What is its definition?

- How shall list-element-surrounding braces be treated?

- How shall space-tokens between list-elements be treated?

E.g., what shall be the expansion of \actual in the following situations?

\setzip\actual {{\zero}\two\four}{\one\three}

\setzip\actual {{\zero}\two\four}{\one{{\three}}}

\setzip\actual {{\zero} \two \four}{\one {{\three}}}

- By now, in the example below, the macro \setzip
performs the assignment in terms of \def.
That might override already defined control-sequences.
What shall happen in case the first argument of \setzip
is an already-defined control-sequence?
Do you want error-checking in case the first argument
of \setzip does not at all consist of a token where
\def can be applied to? (e.g. several tokens / non-active
character-token. Expandably detecting whether a character
is active or not might be an interesting task in case the
character is active but currently let equal to one of its
non-active "pendants" ;-> )

Sincerely

Ulrich


\errorcontextlines=10000
%%
%% This (not so) minimal example was written in June 4, 2010

%% \ziplists{{<e_1>}{<e_2>}..{<e_m>}}%
%% {{<f_1>}{<f_2>}..{<f_n>}}
%%
%% -> <e_1><f_1><e_2><f_2>.. <e_m><f_m>
%%
%% \ziplists merges two lists of undelimited arguments.
%%
%% In case lists are not of equal length, iteration will
%% stop at the shorter list.
%%
%% \ziplists is expandable/can be used
%% in \edef and - due to \romannumeral-expansion - delivers
%% the result after two expansion-steps.
%%
%% Iteration terminates depending on emptiness/blankness of
%% arguments, not on the definition of some termination-marker
%% and thus cannot be terminated erroneously by some token
%% whose meaning would equal that of a list-termination-marker.
%% There is only one internal macro (\@carbrace) where delimited
%% arguments are used, thus you can safely use any non-outer-token
%% inside the two list-arguments.
%%
%% Be aware that space-tokens will be discarded if they separate
%% undelimited macro-arguments.

%%...............................................................
\newcommand\ziplists[2]{%
\romannumeral\@ziplists{#1}{#2}{\@rmstop}%
}%
\newcommand\@@ziplists[5]{%
\@ziplists{#4}{#2}{#5#3#1}%
}%
\newcommand\@ziplists[2]{%
\extractfirstlistarg{%
\extractfirstlistarg{%
\@@ziplists
}{%
\expandafter\@firstofone\@gobbletwo
}{#2}%
}{%
\@firstofone
}{#1}%
}%
%%---------------------------------------------------------------
%% \setzip<macro-token>%
%% {{<e_1>}{<e_2>}..{<e_m>}}%
%% {{<f_1>}{<f_2>}..{<f_n>}}
%%
%% -> defines <macro-token> to expand to
%% <e_1><f_1><e_2><f_2>.. <e_m><f_m>
%%...............................................................
\newcommand\setzip[3]{%
\romannumeral
\expandafter\expandafter\expandafter\@rmstop
\expandafter\expandafter\expandafter\def
\expandafter\expandafter\expandafter#1%
\expandafter\expandafter\expandafter{\ziplists{#2}{#3}}%
}%
\makeatother


\begin{document}

\ttfamily\selectfont\frenchspacing
\parindent\csname z@\endcsname
\parskip .66\baselineskip


\verb|\ziplists{ABCDE}{12345}|:\newline
\expandafter\expandafter\expandafter\def
\expandafter\expandafter\expandafter\Merged
\expandafter\expandafter\expandafter{%
\ziplists{ABCDE}{12345}%
}
\meaning\Merged
\newline
text:->\ziplists{ABCDE}{12345}


\verb|\ziplists{ABC}{12345}|:\newline
\expandafter\expandafter\expandafter\def
\expandafter\expandafter\expandafter\Merged
\expandafter\expandafter\expandafter{%
\ziplists{ABC}{12345}%
}
\meaning\Merged
\newline
text:->\ziplists{ABC}{12345}


\verb|\ziplists{ABCDE}{123}|:\newline
\expandafter\expandafter\expandafter\def
\expandafter\expandafter\expandafter\Merged
\expandafter\expandafter\expandafter{%
\ziplists{ABCDE}{123}%
}
\meaning\Merged
\newline
text:->\ziplists{ABCDE}{123}


\verb|\setzip\actual{\zero\two\four}{\one\three}|:\newline
\setzip\actual{\zero\two\four}{\one\three}
\string\actual: \meaning\actual
\newline
\def\result{\zero\one\two\three}
\ifx\actual\result\else\ddt\fi


\end{document}

Jonathan Fine

Jun 4, 2010, 10:09:07 AM

I'd like to thank all who have contributed to this discussion. Given
the difficulties TeX has with token lists, and in particular the
inability to subscript (a[i]) a list, it seems that a good general
purpose solution to this problem might be of wide interest.

Here's perhaps the simplest form of the problem: Define a function such
that
\zipset\actual {\zero\two\four}{\one\three}
\def\expect{\zero\one\two\three}
\ifx\actual\expect\else\ddt\fi % Don't get \ddt.
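
In Python terms, the test case above amounts to a zip that stops at the shorter list, followed by flattening the pairs. A minimal sketch, with strings standing in for the control sequences and zipset_model a made-up name:

```python
def zipset_model(first, second):
    """Interleave two lists item by item, stopping at the shorter one."""
    return [item for pair in zip(first, second) for item in pair]


actual = zipset_model(['zero', 'two', 'four'], ['one', 'three'])
expect = ['zero', 'one', 'two', 'three']
assert actual == expect  # 'four' has no partner and is dropped
```

Because Python lists can be subscripted and iterated directly, this runs in time proportional to the length of the shorter list.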

Any TeX solution to this problem, unless it uses \csname ... \endcsname
or something similar to emulate a subscriptable list, will have at
least quadratic running time.

I'd like to know if LuaTeX can solve this problem in linear time: David
Kastrup thinks it can be done but working code hasn't been posted yet.

I'd also like to know if the LaTeX3 people are interested in solving
this (and similar) problems.

--
Jonathan

Joseph Wright

Jun 4, 2010, 12:05:12 PM
On Jun 4, 3:09 pm, Jonathan Fine <J.F...@open.ac.uk> wrote:
> I'd also like to know if the LaTeX3 people are interested in solving
> this (and similar) problems.

One of the LaTeX3 aims is to provide a good "toolkit" for programmers.
As you know, we have a set of some tools, and these are being used to
try to build higher-level constructs. At the same time, where there
are obvious gaps then filling them is at least on the "to do" list. Of
course, we can only ultimately do what TeX can do, which means that
the same solution for LaTeX3 will exist without it :-)

TeX programming doesn't always follow other languages: what is easy
in, say, Python may not be a good plan with TeX. So what constitutes a
"good toolkit" is not a fixed thing.

In the case at hand, could I ask what the wider context is? The reason
I ask is that there are a few possible ideas using the LaTeX3 stuff,
for example the mapping functions and property list ideas. However,
I'm not quite sure what is really the best to suggest as I don't know
the wider situation.

I'd point out that my LaTeX3 work tends to be very much on the "grind"
side: other people supply the clever TeX ideas!
--
Joseph Wright

Jonathan Fine

Jun 10, 2010, 5:58:20 AM

Sorry Joseph, it's taken me a while to reply to this. (In the back of
my mind I was waiting for a response from the advocates of Lua.)

I'd quite like, as an exercise, to write with others a macro something like
\zipset\a\b\c
which defines \a to be the Pythonesque zip of macros \b and \c.

Example:
\def\b{\zero\two\four}
\def\c{\one\three}
\zipset\a\b\c
% equivalent to \def\a{\zero\one\two\three}

Although perhaps a toy problem I think this is close enough to being
part of a good toolkit to be interesting.

--
Jonathan

Joseph Wright

Jun 10, 2010, 6:20:21 AM
On Jun 10, 10:58 am, Jonathan Fine <J.F...@open.ac.uk> wrote:
> Example:
>     \def\b{\zero\two\four}
>     \def\c{\one\three}
>     \zipset\a\b\c
>     % equivalent to \def\a{\zero\one\two\three}
>
> Although perhaps a toy problem I think this is close enough to being
> part of a good toolkit to be interesting.

Hello Jonathan,

First thought for the example you give:

\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
\cs_set_nopar:Npn \exp_last_unbraced:NNNV {
\::N \::N \::V_unbraced \:::
}
\tl_new:N \l_tl_zip_tmpa_tl
\tl_new:N \l_tl_zip_tmpb_tl
\cs_new:Npn \tl_zip:NNN #1#2#3 {
\tl_set_eq:NN \l_tl_zip_tmpa_tl #3
\tl_clear:N #1
\tl_map_inline:Nn #2
{
\tl_put_right:Nn #1 {##1}
\tl_if_empty:NF \l_tl_zip_tmpa_tl
{
\tl_put_right:Nx #1
{
\exp_last_unbraced:NNNV \exp_after:wN \exp_not:N
\tl_head:w \l_tl_zip_tmpa_tl \q_nil
}
\tl_set:Nx \l_tl_zip_tmpa_tl
{
\exp_last_unbraced:NNNV \exp_after:wN \exp_not:N
\tl_tail:w \l_tl_zip_tmpa_tl \q_nil
}
}
}
}
\cs_new:Npn \tl_zip_aux:n #1 { }


\def\b{\zero\two\four}
\def\c{\one\three}

\tl_zip:NNN \a \b \c
\tl_show:N \a
\ExplSyntaxOff
\begin{document}
\end{document}

I doubt it would be easy to do a Python-like function which didn't
need to know in advance the type of list to zip (so you'd need
\clist_zip:NNN for a comma-list, for example).
--
Joseph Wright

Joseph Wright

Jun 10, 2010, 7:05:04 AM
On Jun 10, 11:20 am, Joseph Wright <joseph.wri...@morningstar2.co.uk>
wrote:

I realise that this would drop any "excess" in the second list: can be
fixed by adding

\tl_put_right:NV #1 \l_tl_zip_tmpa_tl

after the \tl_map_inline:Nn lines. Also, I think expl3 should probably
have a better way to add the first token to a list in an unexpanded
way: I'll add something like \exp_not_head:n and \exp_not_tail:n to
expl3 soon!
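
The control flow of \tl_zip:NNN, including the extra \tl_put_right:NV step, can be modelled in Python as below. This is a sketch of the loop as posted, assuming the expl3 functions behave as their names suggest; it is not a translation of the expl3 internals. tl_zip_model is a made-up name and strings stand in for tokens.

```python
def tl_zip_model(first, second):
    """Model of the posted loop: walk the first list, pulling heads off the second."""
    remaining = list(second)   # plays the role of \l_tl_zip_tmpa_tl
    result = []
    for item in first:         # \tl_map_inline:Nn over the first list
        result.append(item)
        if remaining:          # \tl_if_empty:NF
            result.append(remaining.pop(0))  # append the head, keep the tail
    result.extend(remaining)   # the follow-up fix: keep any excess
    return result


print(tl_zip_model(['zero', 'two', 'four'], ['one', 'three']))
```

On this reading every item of both lists is kept, so the example yields ['zero', 'one', 'two', 'three', 'four'], which differs from a zip that stops at the shorter list.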
--
Joseph Wright

Message has been deleted

Ulrich D i e z

Jun 10, 2010, 8:02:31 AM
[ This posting supersedes my posting
news:huqi65$5s0$00$1...@news.t-online.com
as the macros \setzip and \zipset provided therein can be
shortened without affecting functionality.
I don't know what I was thinking... ]

Jonathan Fine wrote:

> I'd quite like, as an exercise, to write with others a macro something like
> \zipset\a\b\c
> which defines \a to be the Pythonesque zip of macros \b and \c.

Where can I find specifications of "Pythonesque zip" ?

> Example:
> \def\b{\zero\two\four}
> \def\c{\one\three}
> \zipset\a\b\c
> % equivalent to \def\a{\zero\one\two\three}


The example below provides three macros:

1. \ziplists{{<e_1>}{<e_2>}..{<e_m>}}{{<f_1>}{<f_2>}..{<f_n>}}

-> expands to: <e_1><f_1><e_2><f_2>.. <e_m><f_m>

(\ziplists expandably merges two lists of undelimited arguments.)


2. \setzip<macro-token>{{<e_1>}{<e_2>}..{<e_m>}}{{<f_1>}{<f_2>}..{<f_n>}}

-> defines <macro-token> to expand to: <e_1><f_1><e_2><f_2>.. <e_m><f_m>

3. \zipset<macro-token 1><macro-token 2><macro-token 3>
while <macro-token 2> expands to: {{<e_1>}{<e_2>}..{<e_m>}}
and <macro-token 3> expands to: {{<f_1>}{<f_2>}..{<f_n>}}

-> defines <macro-token 1> to expand to <e_1><f_1><e_2><f_2>.. <e_m><f_m>


(Google-group-users/web-interface-users please make sure that the
posting is shown in "original format".)


Sincerely

Ulrich


\errorcontextlines=10000 %%
%% This (not so) minimal example was written in June 10, 2010

%% (Due to \romannumeral-expansion, "constructing" the assignment
%% which in turn goes into TeX' stomach takes two \expandafter-
%% chains/two expansion-steps.)


%%...............................................................
\newcommand\setzip[3]{%
\romannumeral\expandafter\@rmstop
\expandafter\def
\expandafter#1%
\expandafter{%
\romannumeral\expandafter\@gobble\ziplists{#2}{#3}}%
}%
%%---------------------------------------------------------------
%% \zipset<macro-token 1>%
%% <macro-token 2>%
%% <macro-token 3>%
%% while <macro-token 2> expands to {{<e_1>}{<e_2>}..{<e_m>}}%
%% and <macro-token 3> expands to {{<f_1>}{<f_2>}..{<f_n>}}
%%
%% -> defines <macro-token 1> to expand to
%% <e_1><f_1><e_2><f_2>.. <e_m><f_m>
%%
%% (Due to \romannumeral-expansion, "constructing" the assignment
%% which in turn goes into TeX' stomach takes two \expandafter-
%% chains/two expansion-steps.)
%%...............................................................
\newcommand\zipset[3]{%
\romannumeral
\expandafter\@rmstop
\expandafter\def
\expandafter#1%
\expandafter{%
\romannumeral
\expandafter\PassFirstToSecond\expandafter{#3}{%
\expandafter\PassFirstToSecond\expandafter{#2}{%
\expandafter\@gobble\ziplists
}%
}%
}%
}%
\makeatother


\begin{document}


\verb|\setzip\actual{\zero\two\four}{\one\three}|:\newline
\setzip\actual{\zero\two\four}{\one\three} \string\actual:
\meaning\actual
\def\expected{\zero\one\two\three}%
\ifx\actual\expected\else\ddt\fi


\def\b{\zero\two\four}%
\string\b: \meaning\b\newline
\def\c{\one\three}
\string\c: \meaning\c\newline
\verb|\zipset\a\b\c|:\newline
\zipset\a\b\c
\string\a: \meaning\a
\def\expected{\zero\one\two\three}%
\ifx\a\expected\else\ddt\fi

\end{document}

Jonathan Fine

Jun 10, 2010, 8:35:35 AM
Joseph Wright wrote:

> \cs_set_nopar:Npn \exp_last_unbraced:NNNV {
> \::N \::N \::V_unbraced \:::
> }
> \tl_new:N \l_tl_zip_tmpa_tl
> \tl_new:N \l_tl_zip_tmpb_tl
> \cs_new:Npn \tl_zip:NNN #1#2#3 {
> \tl_set_eq:NN \l_tl_zip_tmpa_tl #3
> \tl_clear:N #1
> \tl_map_inline:Nn #2
> {
> \tl_put_right:Nn #1 {##1}
> \tl_if_empty:NF \l_tl_zip_tmpa_tl
> {
> \tl_put_right:Nx #1
> {
> \exp_last_unbraced:NNNV \exp_after:wN \exp_not:N
> \tl_head:w \l_tl_zip_tmpa_tl \q_nil
> }
> \tl_set:Nx \l_tl_zip_tmpa_tl
> {
> \exp_last_unbraced:NNNV \exp_after:wN \exp_not:N
> \tl_tail:w \l_tl_zip_tmpa_tl \q_nil
> }
> }
> }
> }
> \cs_new:Npn \tl_zip_aux:n #1 { }

Thank you for this, Joseph. However, I don't understand this code. I'd
appreciate a translation into ordinary TeX macros.

Will it work with ConTeXt? I didn't say so, but I think it is a
requirement.

--
Jonathan

Jonathan Fine

Jun 10, 2010, 8:40:29 AM
Ulrich D i e z wrote:
> [ This posting supersedes my posting
> news:huqi65$5s0$00$1...@news.t-online.com
> as the macros \setzip and \zipset provided therein can be
> shortened without affecting functionality.
> I don't know what I was thinking... ]

[TeX macros snipped]

Thank you for this, Ulrich. I guess some of your commands, such as
\PassFirstToSecond, don't really belong in the 'public' namespace.

What you've written is very different from Joseph's. I'm hoping to post
my own answer soon, at least in pseudocode.

--
Jonathan

Jonathan Fine

Jun 10, 2010, 8:59:42 AM
Jonathan Fine wrote:

> I'd quite like, as an exercise, to write with others a macro something like
> \zipset\a\b\c
> which defines \a to be the Pythonesque zip of macros \b and \c.
>
> Example:
> \def\b{\zero\two\four}
> \def\c{\one\three}
> \zipset\a\b\c
> % equivalent to \def\a{\zero\one\two\three}
>
> Although perhaps a toy problem I think this is close enough to being
> part of a good toolkit to be interesting.
>

Here's part of an implementation that will avoid 'churning' the tokens
and which will have linear running time. (Not tested).

\let\ag\aftergroup
\begingroup
\begingroup
\ag\gdef
\ag\a
\ag{
[Store tokens]
[Contribute tokens aftergroup]
\ag}
\endgroup
\endgroup

[Store tokens]

\def\doit #1{%
\ifx #1\sentinel % reached the end marker: stop
\else % otherwise store the token and carry on
\advance\counter 1
\toks\counter{#1}
\expandafter\doit
\fi
}

\expandafter\doit\b\sentinel
\expandafter\doit\c\sentinel


[Contribute tokens aftergroup]
[For appropriate values of \counter]
[Contribute token aftergroup]

[Contribute token aftergroup]

\ag\the
\ag\toks
\uccode`X\counter
\uppercase{\ag X}
\expandafter\ag\space


This pattern has a limit of 256 tokens in \c, but it does have linear
running time. I don't know how bad quadratic running time is with 256
tokens in the answer. 128 * 128 == 16384, which is quite large.

--
Jonathan

Jonathan Fine

Jun 10, 2010, 9:00:45 AM
Jonathan Fine wrote:
> Jonathan Fine wrote:

> \let\ag\aftergroup
> \begingroup
> \begingroup
> \ag\gdef

Oops. Should be \xdef.

--
Jonathan

Joseph Wright

Jun 10, 2010, 10:24:42 AM
On Jun 10, 1:35 pm, Jonathan Fine <J.F...@open.ac.uk> wrote:
> Thank you for this, Joseph.  However, I don't understand this code.  I'd
> appreciate a translation into ordinary TeX macros.

Most of this is straightforward, luckily. A bit of background first.
As you might know, the idea of the expl3 environment is to provide a
more standard programming environment for TeX. So there is supposed to
be some systematic approach to the naming and so forth. More
importantly in this context, expl3 uses ":" and "_" as "extra" letters
for control sequence names (rather than "@"), and also whitespace is
ignored.

Translation mainly requires construction of \tl_map_inline:Nn, but
I've decided to use what in expl3 would be \tl_map_function:NN instead
as for a translation to traditional LaTeX2e it is a bit easier.

\documentclass{article}
\makeatletter
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\def\zip@tmp{}
\def\tl@zip#1#2#3{%
\let\zip@tmp#3
\def#1{}%
\def\zip@aux##1{%
\edef#1{\unexpanded\expandafter{#1}\unexpanded{##1}}%
\ifx\zip@tmp\@empty
\else
\edef#1{%
\unexpanded\expandafter{#1}%
\expandafter\expandafter\expandafter
\noexpand\expandafter\@car\zip@tmp\@nil
}%
\edef\zip@tmp{%
\expandafter\expandafter\expandafter\unexpanded
\expandafter\expandafter\expandafter
{\expandafter\@cdr\zip@tmp\@nil}%
}%
\fi
}%
\expandafter\map@function\expandafter\zip@aux#2\@tail\@stop
\edef#1{\unexpanded\expandafter{#1}\unexpanded\expandafter{\zip@tmp}}
%
}
\def\map@function#1#2{% \tl_map_function:NN
\if@tail@stop{#2}%
#1{#2}%
\map@function#1
}
\def\if@tail@stop#1{%
\expandafter\ifx\if@tail@stop@aux@i#1?\@nil\@tail\@tail
\expandafter\if@tail@stop@aux@ii
\fi
}
\def\if@tail@stop@aux@i#1#2\@nil\@tail{#1}
\def\if@tail@stop@aux@ii#1\@stop{}
\def\@tail{\@tail}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


\def\b{\zero\two\four}
\def\c{\one\three}

\tl@zip\a\b\c
\show\a
\begin{document}
\end{document}

> Will it work with ConTeXt?  I didn't say so, but I think it is a
> requirement.

Well an expl3 solution won't, but of course doing the same thing in
primitives will.
--
Joseph Wright

Jonathan Fine

Jun 10, 2010, 11:19:38 AM
Joseph Wright wrote:

> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> \def\zip@tmp{}
> \def\tl@zip#1#2#3{%
> \let\zip@tmp#3
> \def#1{}%
> \def\zip@aux##1{%
> \edef#1{\unexpanded\expandafter{#1}\unexpanded{##1}}%

Excuse my ignorance, but what's \unexpanded? I've discovered it's a
pdftex primitive, but haven't found documentation for it on the web.

I seem to recall that it's similar to \the\toksregister. In any case,
do you need it here? Won't

\let\xa\expandafter
\xa \def
\xa #1
\xa {
#1
##1
}

do the job just as well?

--
Jonathan

luigi scarso

Jun 10, 2010, 11:31:01 AM
Some other ideas,
maybe you will find them useful.


\starttext

\def\myzipthem#1#2%
{\ctxlua{A = {}; B = {}}
\def\dozipthemA##1{\ctxlua{table.insert(A,[[\string##1]])}}%
\def\dozipthemB##1{\ctxlua{table.insert(B,[[\string##1]])}}%
\handletokens#1\with\dozipthemA
\handletokens#2\with\dozipthemB%
\ctxlua{%
for i=1,math.max(table.maxn(A),table.maxn(B)) do
local a = ""
local b = ""
if A[i] then a = '\\string' ..A[i] end
if B[i] then b = '\\string' ..B[i] end
tex.sprint(a,b)
end
tex.print([[\blank]])
for i=1,math.max(table.maxn(A),table.maxn(B)) do
local a = A[i] or ""
local b = B[i] or ""
tex.sprint(a,b)
end
tex.print([[\blank]])
}
}
\def\Alpha{A}
\def\Beta{B}
\def\Gamma{C}
\def\Delta{D}
\myzipthem{\Alpha\Beta}{\Gamma\Delta}%gives \Alpha\Gamma\Beta\Delta
\myzipthem{}{\Gamma\Delta}
\stoptext

Joseph Wright

Jun 10, 2010, 11:43:19 AM
On Jun 10, 4:19 pm, Jonathan Fine <J.F...@open.ac.uk> wrote:
> Joseph Wright wrote:
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> > \def\zip@tmp{}
> > \def\tl@zip#1#2#3{%
> >   \let\zip@tmp#3
> >   \def#1{}%
> >   \def\zip@aux##1{%
> >     \edef#1{\unexpanded\expandafter{#1}\unexpanded{##1}}%
>
> Excuse my ignorance, but what's \unexpanded.  I've discovered its a
> pdftex primitive, but haven't found documentation for it on the web.

It's an e-TeX primitive, and is rather tersely described in the e-TeX
manual. \unexpanded acts as though the material was placed in a toks
(let's call it \Magictoks) before using \the\Magictoks within the \edef
(or \write, etc.). However, there is no actual assignment necessary,
which makes \unexpanded a bit faster.

> I seem to recall that it's similar to \the\toksregister.  In any case,
> do you need it here?  Won't
>
>      \let\xa\expandafter
>      \xa \def
>      \xa #1
>      \xa {
>          #1
>          ##1
>      }
>
> do the job just as well?

In most but not in all cases. In expl3, I realised a while ago that we
can do

\edef\test{\unexpanded{<material>}}

to let \test hold # tokens. With your approach, the lists to zip
cannot contain '#', whereas with \unexpanded they can. (You could use
an intermediate toks as you suggest, but \unexpanded is faster.)
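To make the point about '#' concrete, here is a small sketch. It assumes plain TeX on an engine with the e-TeX extensions enabled; \scratchtoks, \viaToks and \viaUnexpanded are names invented for this example. I would expect it to report "same" on the terminal.

```tex
\newtoks\scratchtoks              % hypothetical scratch register
\scratchtoks{a#b}                 % a toks register can hold a lone #
\edef\viaToks{\the\scratchtoks}   % \the<toks> is not expanded further in \edef
\edef\viaUnexpanded{\unexpanded{a#b}}% same tokens, no register assignment
\ifx\viaToks\viaUnexpanded
  \message{same}%
\else
  \message{different}%
\fi
% By contrast, \def\direct{a#b} stops with
% "Illegal parameter number in definition of \direct".
\bye
```

Note that \meaning shows a stored '#' doubled, so the displayed form of either macro is a##b even though only one '#' token is held.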

One of the recent changes to expl3 was to alter the definition of
"token list variables" (macros used to store tokens) to use this
\unexpanded approach. Doing that means that you can store everything
you need to without bothering with toks at all (as inside an \edef you
can do \unexpanded\expandafter{\<macro>} without needing to have long
\expandafter chains).
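As an illustration of that last parenthesis, a sketch under the same assumptions (e-TeX extensions available; \deeper, \inner, \store, \result and \expected are invented names). The single \expandafter expands the storage macro exactly once, and \unexpanded then protects whatever came out. I would expect the message "one level only".

```tex
\def\deeper{never reached}
\def\inner{\deeper}
\def\store{\inner}
% \expandafter expands \store exactly once before \unexpanded reads the group
\edef\result{\unexpanded\expandafter{\store}}
\def\expected{\inner}
\ifx\result\expected
  \message{one level only}%
\else
  \message{unexpected}%
\fi
\bye
```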

Of course, if you want a solution that does not use e-TeX then you do
need to use either the \expandafter chain or a toks.
--
Joseph Wright


Jonathan Fine

Jun 10, 2010, 11:47:17 AM
luigi scarso wrote:
> \starttext
>
> \def\myzipthem#1#2%
> {\ctxlua{A = {}; B = {}}
> \def\dozipthemA##1{\ctxlua{table.insert(A,[[\string##1]])}}%
> \def\dozipthemB##1{\ctxlua{table.insert(B,[[\string##1]])}}%
> \handletokens#1\with\dozipthemA
> \handletokens#2\with\dozipthemB%
> \ctxlua{%
> for i=1,math.max(table.maxn(A),table.maxn(B)) do
> local a = ""
> local b = ""
> if A[i] then a = '\\string' ..A[i] end
> if B[i] then b = '\\string' ..B[i] end
> tex.sprint(a,b)
> end
> tex.print([[\blank]])
> for i=1,math.max(table.maxn(A),table.maxn(B)) do
> local a = A[i] or ""
> local b = B[i] or ""
> tex.sprint(a,b)
> end
> tex.print([[\blank]])
> }
> }

Let LT = LuaTeX.

Thank you for this, Luigi. Am I right in thinking that this code sends
one or more strings back to LT, which LT then parses into tokens?

--
Jonathan

Jonathan Fine

Jun 10, 2010, 11:52:41 AM
Joseph Wright wrote:

>> Won't
>>
>> \let\xa\expandafter
>> \xa \def
>> \xa #1
>> \xa {
>> #1
>> ##1
>> }
>>
>> do the job just as well?
>
> In most but not in all cases. In expl3, I realised a while ago that we
> can do
>
> \edef\test{\unexpanded{<material>}}
>
> to let \test hold # tokens. With your approach, the lists to zip
> cannot contain '#', whereas with \unexpanded they can. (You could use
> an intermediate toks as you suggest, but \unexpanded is faster.)

Yes, that is a wart of sorts in TeX. But it can be avoided. Make '#'
an active character, whose meaning is macro parameter character '#'. In
almost all situations such an active hash is the same as a real hash.

$ tex
This is TeX, Version 3.1415926 (MiKTeX 2.7)
**\let~#

*\def\a~1{\message{Hi, (~1)}}

*\a{hi there}
Hi, (hi there)
*

But when you need to do token processing, just temporarily set active
'#' to \relax (or indeed any unexpandable).

> One of the recent changes to expl3 was to alter the definition of
> "token list variables" (macros used to store tokens) to use this
> \unexpanded approach. Doing that means that you can store everything
> you need to without bothering with toks at all (as inside an \edef you
> can do \unexpanded\expandafter{\<macro>} without needing to have long
> \expandafter chains).
>
> Of course, if you want a solution that does not use e-TeX then you do
> need to use either the \expandafter chain or a toks.

Or you could use active '#' as described above.

--
Jonathan

luigi scarso

Jun 10, 2010, 12:00:10 PM

\handletokens is the key here.
If you are using context mkii it should be defined in supp-mis.mkii;
in mkiv it should be defined in syst-aux.mkiv

Joseph Wright

Jun 10, 2010, 12:58:02 PM
On Jun 10, 4:52 pm, Jonathan Fine <J.F...@open.ac.uk> wrote:
> > With your approach, the lists to zip
> > cannot contain '#', whereas with \unexpanded they can. (You could use
> > an intermediate toks as you suggest, but \unexpanded is faster.)
>
> Yes, that is a wart of sorts in TeX.  But it can be avoided.  Make '#'
> an active character, whose meaning is macro parameter character '#'.  In
> almost all situations such an active hash is the same as a real hash.
>
> $ tex
> This is TeX, Version 3.1415926 (MiKTeX 2.7)
> **\let~#
>
> *\def\a~1{\message{Hi, (~1)}}
>
> *\a{hi there}
> Hi, (hi there)
> *
>
> But when you need to do token processing, just temporarily set active
> '#' to \relax (or indeed any unexpandable).

This is true for the situation where you control the input, but for
expl3 the idea is to be general. The original approach was to stick to
two data types:

- tl ("token list variable" = macro used for storage), to store
everything except input which might contain '#'.
- toks, to store really anything but with the proviso that inside an
\edef you only get one expansion.

That is of course what you do with plain TeX or LaTeX2e. However, you
are then left having to explain the detail about how the variables
"work", which is very TeX-specific. By consistently using a definition
for tl's based on \unexpanded we end up with just one data type for
tokens

- tl, to store everything, and which can be restricted inside an
\edef using \unexpanded\expandafter{...} (which in expl3 is nicely
"wrapped up" as \exp_not:V)

As I say, whether this is important depends on the context: if you
know the input will never contain '#' then you don't have to worry.
With my LaTeX3 'hat' on I don't have that luxury.
--
Joseph Wright

Jonathan Fine

Jun 11, 2010, 4:40:23 AM

> \handletokens is the key here.


> If you are using context mkii it should be defined in supp-mis.mkii;
> in mkiv it should be defined in syst-aux.mkiv

A lot of the code above, Luigi, is quite new to me. And so I've asked a
question. I've had a look at the definition of \handletokens and it
doesn't help me answer my question (which I thought was clear enough).

--
Jonathan

Jonathan Fine

Jun 11, 2010, 4:48:03 AM
Joseph Wright wrote:
> On Jun 10, 4:52 pm, Jonathan Fine <J.F...@open.ac.uk> wrote:
>>> With your approach, the lists to zip
>>> cannot contain '#', whereas with \unexpanded they can. (You could use
>>> an intermediate toks as you suggest, but \unexpanded is faster.)
>> Yes, that is a wart of sorts in TeX. But it can be avoided. Make '#'
>> an active character, whose meaning is macro parameter character '#'. In
>> almost all situations such an active hash is the same as a real hash.
>>
>> $ tex
>> This is TeX, Version 3.1415926 (MiKTeX 2.7)
>> **\let~#
>>
>> *\def\a~1{\message{Hi, (~1)}}
>>
>> *\a{hi there}
>> Hi, (hi there)
>> *
>>
>> But when you need to do token processing, just temporarily set active
>> '#' to \relax (or indeed any unexpandable).
>
> This is true for the situation where you control the input, but for
> expl3 the idea is to be general. The original approach was to stick to
> two data types:

Hegel said something like: To do something great one must learn to limit
oneself.

I'd be surprised if letting active hash to a macro parameter hash
character at the very beginning of LaTeX would break anything. Do you
have any evidence to the contrary? (I admit I don't have evidence in
support - yet.)

> - tl ("token list variable" = macro used for storage), to store
> everything except input which might contain '#'.
> - toks, to store really anything but with the proviso that inside an
> \edef you only get one expansion.
>
> That is of course what you do with plain TeX or LaTex2e. However, you
> are then left having to explain the detail about how the variables
> "work", which is very TeX-specific. By consistently using a definition
> for tl's based on \unexpanded we end up with just one data type for
> tokens
>
> - tl, to store everything, and which can be restricted inside an
> \edef using \unexpanded\expandafter{...} (which in expl3 is nicely
> "wrapped up" as \exp_not:V)

> As I say, whether this is important depends on the context: if you
> know the input will never contain '#' then you don't have to worry.
> With my LaTeX3 'hat' on I don't have that luxury.

I think it's time LaTeX3 gave itself the luxury of a smaller and less
general hat.

--
Jonathan

David Kastrup

Jun 11, 2010, 5:19:20 AM
Jonathan Fine <J.F...@open.ac.uk> writes:

I wouldn't be surprised. The output form changes in that case, not
duplicating '#' chars in macros anymore, while the input still requires
the duplication. Since LaTeX passes a lot of information through
intermediary files, this is not likely to be without problem.
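A small plain TeX experiment shows the difference in output form. The name \realhash is invented for this sketch, and the comments note what I would expect to see on the terminal.

```tex
\message{[#]}      % ordinary '#' (catcode 6): shown doubled, [##]
\begingroup
  \let\realhash=#  % remember the meaning of the parameter character
  \catcode`\#=13   % make '#' active ...
  \let#=\realhash  % ... and give it that meaning
  \message{[#]}    % active '#': shown as typed, [#]
\endgroup
\bye
```

The same asymmetry applies to \write, which is why anything read back from an auxiliary file would see a different number of '#' characters than before.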

LaTeX is complex enough that a comprehensive solution fixing all
problems occurring because of that is not likely to be easy.

> I think it's time LaTeX3 gave itself the luxury of a smaller and less
> general hat.

Would it be LaTeX then?

--
David Kastrup
UKTUG FAQ: <URL:http://www.tex.ac.uk/cgi-bin/texfaq2html>

Joseph Wright

Jun 11, 2010, 6:44:12 AM
On Jun 11, 9:48 am, Jonathan Fine <J.F...@open.ac.uk> wrote:
> I'd be surprised if letting active hash to a macro parameter hash
> character at the very beginning of LaTeX would break anything.  Do you
> have any evidence to the contrary.  (I admit I don't have evidence in
> support - yet.)

There is a strong feeling that we should probably make '#' into an
'other' character \AtBeginDocument in a putative LaTeX3 format.
However, that doesn't help in the preamble or inside code sections.
The point of expl3 is to provide a coding environment, and for that we
want well-behaved variables. The \unexpanded method means that we can
have a well-behaved "token list variable" which works.
--
Joseph Wright

luigi scarso

Jun 11, 2010, 8:02:37 AM
On Jun 11, 10:40 am, Jonathan Fine <J.F...@open.ac.uk> wrote:
> A lot of the code above, Luigi, is quite new to me.  And so I've asked a
> question.  I've had a look at the definition of \handletokens and it
> doesn't help me answer my question (which I thought was clear enough).
The code above is quite simple:
the main idea is that a list is empty or has one or more elements,
where an element is a token.

Here I declare two *global* Lua tables (ok, not a good idea)


\ctxlua{A = {}; B = {}}

These macros insert the \string of their argument into table A and
table B respectively


\def\dozipthemA##1{\ctxlua{table.insert(A,[[\string##1]])}}%
\def\dozipthemB##1{\ctxlua{table.insert(B,[[\string##1]])}}%

\handletokens#1\with#2 handles the tokens of #1 with macro #2.
I don't really care about \handletokens now -- it just does what I need.

\handletokens#1\with\dozipthemA
\handletokens#2\with\dozipthemB%

The net effect is that I collect the \string version of each token into
tables A and B.
Zip is now trivial: here I use Lua code to print the zipped list
element by element
(it's trivial to collect each element of the zipped list into another
table C, btw).


\ctxlua{%
for i=1,math.max(table.maxn(A),table.maxn(B)) do
local a = ""
local b = ""
if A[i] then a = '\\string' ..A[i] end
if B[i] then b = '\\string' ..B[i] end
tex.sprint(a,b)
end

If I want to evaluate the list

Jonathan Fine

Jun 11, 2010, 8:15:28 AM
luigi scarso wrote:
> On Jun 11, 10:40 am, Jonathan Fine <J.F...@open.ac.uk> wrote:
>> A lot of the code above, Luigi, is quite new to me. And so I've asked a
>> question. I've had a look at the definition of \handletokens and it
>> doesn't help me answer my question (which I thought was clear enough).

[snip]

> The net effect is that I collect the \string version of token into
> table A and B.
> Zip is now trivial: here I use lua code to print the zipped list
> element by element

[snip]

> \ctxlua{%
> for i=1,math.max(table.maxn(A),table.maxn(B)) do
> local a = ""
> local b = ""
> if A[i] then a = '\\string' ..A[i] end
> if B[i] then b = '\\string' ..B[i] end
> tex.sprint(a,b)
> end

Thanks, Luigi. This is most helpful. The interface between LT and Lua
is via strings.

I assume that LT uses catcodes to turn input strings into tokens. If
so, then there will be inputs that can't be processed correctly, because
the output can't be created without category code changes. (I'm
allowing characters as well as control sequences in the input.)

For example, and please correct me if I'm wrong, the Lua side can't
produce the answer for
\def\b{B}
\escapechar -1
\edef\c{\string \B}
\zipset\a\b\c
Expect letter B followed by character B.

If I'm right then using Lua to solve the quadratic run time problem
introduces another problem (which needs to be thought about).
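The category-code difference I have in mind can be seen without any Lua at all. In this plain TeX sketch (\letterB and \otherB are names invented here) the two macros both print as B, yet I would expect \ifx to report "different", because one holds a letter and the other an "other" character.

```tex
\def\letterB{B}% B with catcode 11 (letter)
\begingroup
  \escapechar=-1
  \xdef\otherB{\string\B}% B with catcode 12 (other)
\endgroup
\ifx\letterB\otherB
  \message{same}%
\else
  \message{different}%
\fi
\bye
```

A round trip through strings has to pick one category code per character when the text is read back, so it cannot reproduce both tokens in the same result.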

--
Jonathan

Jonathan Fine

Jun 11, 2010, 4:21:02 PM
David Kastrup wrote:
> Jonathan Fine <J.F...@open.ac.uk> writes:

>> I'd be surprised if letting active hash to a macro parameter hash
>> character at the very beginning of LaTeX would break anything. Do you
>> have any evidence to the contrary. (I admit I don't have evidence in
>> support - yet.)
>
> I wouldn't be surprised. The output form changes in that case, not
> duplicating '#' chars in macros anymore, while the input still requires
> the duplication. Since LaTeX passes a lot of information through
> intermediary files, this is not likely to be without problem.

I've hacked a source file to make the required changes.
===
latex-hash$ rcsdiff -r1.1 -r 1.2 latex.ltx
rcsdiff: RCS/1.2,v: No such file or directory
===================================================================
RCS file: RCS/latex.ltx,v
retrieving revision 1.1
retrieving revision 1.2
diff -r1.1 -r1.2
85a86,90
>
> \let\@@hash=#
> \catcode`\# 13
> \let #\@@hash
>
287a293,297
>
> \let\@@hash=#
> \catcode`\# 13
> \let #\@@hash
>
2182c2192
< \catcode`\#6%
---
> \catcode`\#\active%
7856a7867,7871
>
> \let\@@hash=#
> \catcode`\# 13
> \let #\@@hash
>
===

The resulting format file, which must be called something other than
latex.fmt, seems to work just fine for me.

David's quite right, that in the special circumstance of \message and
\write (and perhaps a few others) active hash and character hash differ
in that the former does not double hash and the latter does.

I think that within LaTeX this can be handled by adding
\edef #{\string#\string#}
to the behaviour of \protect. Before incorporating such a change into a
release version of LaTeX one would want to make careful and rigorous
tests first, but I have no doubt that it could be made at least 99.999%
compatible for real-world documents.

--
Jonathan

Taco Hoekwater

Jun 12, 2010, 3:08:15 AM
to Jonathan Fine
Jonathan Fine wrote:
> luigi scarso wrote:
>> On Jun 11, 10:40 am, Jonathan Fine <J.F...@open.ac.uk> wrote:
>>> A lot of the code above, Luigi, is quite new to me. And so I've asked a
>>> question. I've had a look at the definition of \handletokens and it
>>> doesn't help me answer my question (which I thought was clear enough).
>
> [snip]
>
>> The net effect is that I collect the \string version of token into
>> table A and B.
>> Zip is now trivial: here I use lua code to print the zipped list
>> element by element
>
> [snip]
>
>> \ctxlua{%
>> for i=1,math.max(table.maxn(A),table.maxn(B)) do
>> local a = ""
>> local b = ""
>> if A[i] then a = '\\string' ..A[i] end
>> if B[i] then b = '\\string' ..B[i] end
>> tex.sprint(a,b)
>> end
>
> Thanks, Luigi. This is most helpful. The interface between LT and Lua
> is via strings.

Not necessarily, but Luigi's code does.

It would be possible to store the two lists in token registers before
calling the lua side, and let lua code merge the two token registers
directly into a third token list (so that is an interface via 3 token
registers).

Best wishes,
Taco
