Colin makes a valid argument that the functionality is useful beyond
[proc], [method], and [apply] of λ-forms. In particular, invocations
of coroutines would conceivably want to parse keyword parameters. (I'm
not quite following the argument about [interp alias], but that really
doesn't matter. A single coherent example is enough to demonstrate
that the set is incomplete.) That militates in favor of doing argument
processing with [eatargs].
I initially disfavored this approach for three reasons. The first is
that it's some amount of additional work for program analysis. As far
as I can grasp from Colin's verbose screeds, he seems to be labouring
under the misconception that I was referring to human program
understanding.
The restrictions on the semantics of named args that Kevin discusses in this post seem reasonable to me. This is the right time to implement such restrictions, before any legacy code exists.
My performance-related objection to eatargs is that it would seem to
preclude optimizations by the compiler guided by the arg syntax. After
reading Kevin's discussion of what the compiler is currently doing, it
seems that's somewhat covered; a transition to the new proc syntax
could eventually allow for simplifications of the compiler as upvar
transitions over time to -upvar.
The existence of a standalone entry point for the parameter parsing is a good idea, but it’s really a separate issue from whether proc and other tools that use the underlying API should include this parameter parsing. A separate standalone entry point for handling things like script level command line parsing could still be created even if this code were always used by proc.
Given that it’s been carefully designed to never conflict with existing proc syntax, and there’s no overhead at runtime for code that doesn’t use the extended syntax, I’m still having trouble understanding the objections to including it in proc.
I'm willing to entertain the idea of having extended versions of
[proc], [method] and [apply] that have the extended parameter
specifications built in. Nevertheless, I've been convinced that
coroutine invocations, at least, require that the [eatargs]
functionality be available stand-alone. There just isn't any place to
hang the information about what [yieldto] is expecting for its next
set of arguments.
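A sketch of the sort of thing a stand-alone [eatargs] would make
possible ([eatargs] and its spec syntax are, of course, the proposal
under discussion, not existing Tcl):

```tcl
proc accumulate {} {
    set result {}
    while 1 {
        # [yieldto] hands back the resumption arguments only as a
        # plain list; there is no formal parameter list to decorate.
        set args [yieldto string cat $result]
        # So a stand-alone parser is the only place to hang the spec
        # for what the *next* resumption expects.
        eatargs {{step -name step -default 1}}
        set result [incr total $step]
    }
}
coroutine acc accumulate
acc -step 5    ;# each resumption parses its own keyword parameters
```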
Given that [eatargs] has to be available for that function, my natural
inclination is to avoid the syntactic sugar of decorating [proc],
[method] and [apply] with the arg specs, in favor of leading off with
[eatargs]. I started off favoring the approach of having the arg specs
on the procedure-defining commands, but once I realized that [eatargs]
is needed in other contexts, I reconsidered. It's not that strong a
preference, and putting decorated params back in wouldn't be a
showstopper for me. Not having a stand-alone [eatargs] available would
be a showstopper at this point.
As far as the injection of code goes, anything that today has similar
functionality (keyword parameters, upvar and uplevel, etc.) puts the
requirement on the instrumentor that the injected code be able to deal
with 'args', with variable names that have not yet been passed to
'upvar', and so on. If I understand you correctly, you're complaining
about new functionality that [eatargs] would not provide, rather than
existing functionality that would be lost.
>> 2. Introspection without an extended [info]
>> -------------------------------------------
>> It occurs to me that we could get most of our introspection back if
>> we adopt Tk's way of doing business - ask a command itself what its
>> syntax is.
>
> Calling a proc to obtain a usage message is convenient in specific
> interactive cases, but far away from a generally available mechanism.
>
> Of all potential future procedures taking an "args" on first line, a
> not-neglectible fraction would actually take args as a list of objects
> to deal with. And debugging/instrumentation tools just cannot rely
> on it. (Reminds me of my recent attempt of getting a usage from/for
> 'namespace ensemble create' - still had to look for the man page.)
>
> That problem about a correct usage message even for ensemble-members
> appears to me like dwarfed by the problem of general availability.
>
> PS: auto-checking for certain options in some "eatargs" implementation
> would induce even worse data-triggered bugs, than those (meanwhile
> solved!) ones in a previous version of the tip457.
I'll concede this to be the weakest plank in the platform. I really
don't have a good general answer. Perhaps I'm putting too much weight
on introspectability.
>> 3. Combinations of Syntax
>> -------------------------
>> There is a lot of discussion about combinations of keyword parameters,
>> optional positional parameters, and 'args', and about the cost of deep
>> parameter inspection and the risks of injection attacks. When proposed
>> limitations are raised, though, people keep presenting counterexamples
>> of particular Core commands that can't be modeled under these
>> restrictions.
>
> Nobody here raised those commands that really cannot be modelled in a
> param-spec ("if" as most prominent example). All the commands raised
> so far do make sense as being describeable with a param spec.
>
> My dream of tip-457-and-beyond would have been, that even all C-coded
> commands would eventually get a means to describe their interface as a
> param-spec, retrieveable by [info args ...] and only very few of them
> (like "if") would fall back to "args".
>
> There is of course a chasm between param-specs that are merely unambiguous
> for binding, and those that are even unambiguous at compile-time before the
> values are known. Even with a perfectly restricted system of compile-time
> predetermined param-specs there are still cases (around {*}-expansion, or
> with a "--" potentially introduced by a subst), that cannot be compiled,
> anyway, but can still be legally called.
>
> Trying to tie it down to prevent not-compileable invocations seems like
> a "wag-the-dog" case to me.
You misunderstand me. I'm not trying to prevent non-compilable
invocations, and some form of 'eval' will always be with us. What I am
trying to do is to make sure that the common static cases will be
compilable. The nastiness that surrounded [switch] is an example,
where code could specify all the -keywords as constants, and still not
be compilable because it was not provable that variable args were not
-keywords. I don't want the latter to become the common case. Falling
back on interpreted code should be the exception rather than the rule.
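To make the [switch] precedent concrete (this part is real, current
Tcl): with all-literal words the compiler can prove no pattern is
secretly an option and emit a jump table; a single expansion is
enough to force the interpreted path.

```tcl
# Compilable: every word is a literal, so it is provable that none
# of the patterns is really a -glob / -regexp / -exact option.
switch -- $x {
    a { doA }
    b { doB }
}

# Not compilable: {*}$patterns might begin with "-regexp", so the
# parse cannot be pinned down before the values are known.
switch -- $x {*}$patterns
```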
> Apart from that you omitted the necessary "sub-zero"-guard for args, it's
> essentially what I've done in my procx.
OK, it was a quick hack. Feel free to correct the details. At least the
examples of the behaviour are correct, aren't they?
>> 4. Keyword parameters
>> ---------------------
>> For clarity, I shall use 'noun', 'adverb', 'preposition' and 'object'
>> to describe these components.
>
> Splendid! Not sure about the "nouns", though. Are these the required
> positionals, versus "object" the defaulted ones?
I had intended 'nouns' to encompass required, optional, and 'args';
'objects' are associated with the prepositions. (I'm sufficiently
rusty on German grammar not to be able to translate the
technical term, and the corresponding concept in German
confuses me in any case: I can manage to distinguish
'ich lege das Buch auf deN Tisch' from 'das Buch liegt
auf deM TischE', but phrases like 'außerhalb VOM Garten'
versus 'außerhalb DES GartenS' or 'trotz deM Wetter'/
'trotz deS WetterS' strain both my memory and my comprehension.)
>> The parser should know when an adverb or preposition is expected and
>> simply be able to examine what word it is, ...
>> As a corollary to this rule, we cannot have a varying number of
>> arguments before the keyword parameters begin.
>
> This corollary is not entirely true: given a contrived proc like:
> proc log {message {qualifier {}} {qualopts -name {...}}} { ... }
> then qualifier options could only be passed after a qualifier.
> This is *almost* the same thing as having a series of defaulted
> params in current proc: to provide a value for a latter one, one
> must provide values to all former ones.
Yes, we could contrive such a thing. I'd be hard-put, though, to
come up with a general set of rules that would make it unambiguous.
Would the qualifier be interpreted as such only if it doesn't begin
with a hyphen? Only if it doesn't match one of the qualopts?
Can a qualifier have text that matches one of the qualopts, and
how would you specify that?
> There is, however, an important difference between *required* objects
> and *required* adverbs/prepositions - something I tried to solve in
> procx and consider failed: the point being is, that one doesn't know
> how much to "reserve" for required named params following lateron.
Exactly. Stu has also pointed out on the Chat, at least, that
it is useful to allow multiple instances of the same preposition,
with later instances overriding earlier ones. That allows for
defaults to be supplied at the start of a command line with
a later interpolation of {*}$args to override them. So even if
all your keyword parameters are -required 1, you still don't
know how many you have.
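The defaults-then-override idiom Stu describes, sketched with a
hypothetical [log] command (it assumes repeated prepositions are
legal and later ones win):

```tcl
# 'log' is assumed to accept the keyword parameter -level more
# than once, with the last occurrence taking effect.
proc logInfo {args} {
    # -level info is a default; a caller-supplied -level in $args
    # lands later on the command line and overrides it.
    log -level info {*}$args
}
logInfo "plain message"               ;# effective level: info
logInfo -level debug "noisy detail"   ;# effective level: debug
```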
>> Clearly, the keyword parameters must follow next, in a block. The
>> Unix command-line practice of allowing non-keyword arguments
>> interspersed among the keyword arguments has nothing to recommend it
>> to Tcl.
>
> Most of the unix commands stop accepting options once the first
> non-option or "--" is encountered. (iirc that's a feature of usual
> getopt() impl's) Only rare commands accept further options interspersed
> with objects.
The toolchain commands are the chief offenders here. 'gcc', 'ld'
and similar commands allow -options and file names to be
interspersed helter-skelter, and are even sensitive to the ordering
of the -options. I want that sort of thing to be Out Of Scope
for a standard args-parsing procedure.
> I feel "guilty" for having brought in the possibility of multiple
> blocks of named params, and I did it only in the light that it would
> be a dead giveaway. The "cases" I had in mind involved separating
> multiple blocks of named params by literally given subcommands,
> which even a compiler could have got right, but it's not worth
> jumping through loops if that would actually be necessary.
Don't feel guilty at all. In a proposal like this, it's important to
consider all the cases. We can decide that certain cases are
out of scope, but that should be a conscious decision, and
we should have a clear specification of what the eventual
implementation will and will not address.
>> Deprecated alternative: non-hyphen args
>> ---------------------------------------
>> We could also adopt the rule that if an argument without a leading
>> hyphen is encountered where a keyword is expected, that ends the
>> keyword arguments. Core commands such as [puts] follow this rule.
>> Note, however, that adopting this rule means that we are requiring
>> deep inspection of the parameter data.
>
> This deep inspection of the parameter data isn't an issue if the data
> is constrained by the proc's semantics. Who cares, if a channel name
> or a widget path name is inspected for a leading dash?
I should perhaps have said 'disfavoured' rather than 'deprecated'.
I'm willing to permit it, but with the clear warning that it lacks full
generality and invites errors.
Inspecting a widget name or a channel name for a
leading hyphen is fine. But the temptation to use similar syntax
with file names, command and variable names, and data from
external sources will be almost irresistible if this becomes
a popular approach. We're all aware that those things can indeed
contain leading hyphens, legitimately. We do not want the
correct interpretation of a command to be jeopardized by
an unusual but legal name for something. The extra cost
of having to write something like '-fromfile $filename'
in place of '$filename' will sometimes be the price of
getting an unambiguous and safe parse.
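A concrete instance of the hazard, with a hypothetical [readData]
command: under the non-hyphen rule a perfectly legal file name
derails the parse.

```tcl
set filename "-novel.txt"    ;# unusual, but a legal file name
# Under the non-hyphen rule this looks like an unknown keyword:
#   readData $filename       ;# error: bad option "-novel.txt"
# The unambiguous spellings cost one extra word:
readData -- $filename        ;# explicit end of the keyword block
readData -fromfile $filename ;# or a preposition carrying the name
```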
> The compiler cannot know about these, but in cases where the values are
> given literally it shouldn't have a problem at all, and in other cases
> like that of puts with a channel, the compiler just has to sigh and emit
> an "invokeStk #" (or whatever equivalent in quadcode).
See above. I'm less concerned with the ability to compile than I am
with the ability to interpret the command syntax correctly.
> One even much worse paradigm, that new param-specs don't (and really
> shouldn't) follow is that of abbreviating option names down to the
> shortest unique prefix. Instead a proc/commmand should specify some
> abbreviations that make sense, and that will be preserved even if it
> later grows a new option with same prefix. Tip 457 does this correctly,
> and I send a "no!" to those who request this utterly broken unique-prefix
> bungle for tip-457 named params.
Amen..
>> The syntax winds up being something like:
>> [command] [mandatory arg]* ([adverb] | [prepositional phrase])* \
>> '--' ([mandatory arg] | [optional arg] | 'args')*
>
> Rather:
> [command] [mandatory arg]* [optional arg]* \
> ( ( [adverb] | [prepositional phrase] )+ '--' )? \
> ([mandatory arg] | [optional arg] | 'args')*
>
> I'm not going to "fight" for the extra [optional arg]*, but it is
> technically unambiguous *even if used with non-literals*.
I already asked above. I'm not certain that I follow your argument
about the ambiguity. It strikes me that you haven't ruled out all the
evil cases.
In any case, anywhere that mandatory and optional
args can appear together, they can be intermixed freely.
As long as we know how many optional args we need to
fill, we can come up with the mapping unambiguously.
> The '--' obviously shouldn't even be allowed *unless* there exist
> named params.
Yeah.
> The addendum about only one 'args' and about '--' being optional in
> some cases depending on last part of course still applies.
Right. It's important to support at least the syntax of
verb -adverb -preposition object ... noun
without needing the -- to introduce the nouns.
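An instance of that shape with hypothetical names: once the trailing
nouns begin, nothing that follows can be mistaken for a keyword, so
no '--' is needed.

```tcl
# verb -adverb -preposition object ... noun
# (-verbose is an adverb; -to and -subject each take an object;
#  $messageBody is the trailing noun)
sendMail -verbose -to $recipient -subject $subject $messageBody
```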
What’s the advantage of this over plan C:
* extend proc per tip #457
* also add a standalone parser function that uses the same engine
In both cases it's extending proc.
The only downside is that if someone comes up with a better way of extending proc they can’t do it. Even if someone in the future manages to come up with something better than TIP#457 (which is, by the way, designed for extensions), it will have to swim upstream against the TIP#457 syntax in the parser that everyone is using already.
Well, it is a question of priorities, and varied return on investment.
For those who absolutely depend on Nagelfar or similar to statically
isolate call sites where the signature is violated, then indeed with
[eatargs] an extra investment is needed from Nagelfar, to reach out
into the proc's body to find [eatargs]. Note this is doable, since
Kevin's and Donal's quadcode is ready to do the same. And Nagelfar
will also need an update to cope with TIP457...
Then there's the harder question of "injecting code at the beginning".
Can you clarify the use case?
Indeed if you inject [if {![llength $args]} {error "At least one
option please"}], then the semantics of the [eatargs]-calling proc
will be wildly different from that of the TIP457-based proc.
Then it will be completely sound for Nagelfar or Quadcode to decide
that "no, the signature is not simply ARGSPEC", hence their failure to
find the [eatargs ARGSPEC] at the beginning is no felony.
See how the taste of those grapes depends on how much of the stem and
branch and leaves you swallow with them ;-) ?
> If (merely for sake of discussion) we dropped named args, and resorted
> to plain tip-288, leaving it to user code to extract options
> from $args, then they'd still potentially have optional arguments
> before 'args' (and thus before options) and coincidentally with exactly
> the same resulting semantics w.r.t preceding optional positionals that
> I tried to explain.
>
> If we modeled named arguments as being constrained to a tip-288 'args'
> part, then the optional positionals would even be predetermined - though
> at the "semantic cost" that even optional positionals to the right of
> the named params would get binding priority over the named params.
Yep, that's the idea of a refinement that aspect and I simultaneously
came up with yesterday:
- let proc just allow for TIP288, or even better, Kevin's improved variant:
{a {b B} args {c C} d {e E}} ;# Kevin's message details the
priorities: mandatory, then default, then args
- let [eatargs] chew on args only, spitting back the new value of
args (which may be non-empty if "args" appears in ARGSPEC):
set args [eatargs $args ARGSPEC]
(
Note that the latter can be decided to be abbreviated as
eatargs args ARGSPEC ;# since args is a r/w variable here
or even more concisely as
eatargs ARGSPEC ;# since args is the only var we'll need to
reach. Yep, special case. As in proc.
I'm completely agnostic about which of the three variants should make
it. Kevin prefers the first.
)
So, with this refinement, we'd write:
proc f {a {b B} args {c C} d {e E}} {
set args [eatargs $args ARGSPEC-FOR-NAMED-ARGS] ;# Kevin's notation
...
}
This way:
- the arg protocol for proc remains on the safe side, with only
decisions based on arg counting and no string comparison (hence no
string rep generation). This can simply be an update of TIP288
matching Kevin's spec.
- named-args aficionados keep the right to use them in their
string processing glory
- direct introspection of mandatory and default args remains as
usual ; indirect introspection of named args is still doable as just
described.
- a slight "energy barrier" remains so that people don't
carelessly indulge in named args, ignoring the perf hit
- [eatargs] candidates can compete for ages, company per
company, group per group. Then if/when a local optimum is found,
shared, and consensual, it can be promoted either to TIP457 form or to
Frédéric's idea of {args ARGSPEC}:
proc f {a {b B} {args ARGSPEC-FOR-NAMED-ARGS} {c C} d {e E}} {..}
-Alex
> What’s the advantage of this over plan C:
> * extend proc per tip #457
> * also add a standalone parser function that uses the same engine
> Both cases it’s extending proc.
From a documentation standpoint it keeps the [proc] man page simple (this was one of the major objections).
It's also more easily transposable because [proc] is not privileged compared to other command-creating commands. Applying the {args ?parser?} reform to methods, lambdas etc. is very natural. Ditto for C-level commands.
The parser function is also freed from any compatibility constraints with the old proc syntax.
Last, it adds a layer of self-description if the parser function names are chosen adequately. E.g.:
proc p1 {{args gnu-args}} body
proc p2 {{args tk-args}} body
With [gnu-args] and [tk-args] being generic argument parsers that follow the GNU getopt and Tk configure/cget conventions.
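One possible shape for such a parser contract, sketched in plain,
runnable Tcl (the dict-returning convention is an assumption, not
anything specified so far): the parser maps the actual argument list
to variable assignments, and the machinery behind [proc] installs
them.

```tcl
# Hypothetical contract: a parser takes the actual arguments and
# returns a dict of variable names and values.
proc tk-args {arglist} {
    set out [dict create]
    foreach {opt val} $arglist {
        # Tk convention: strict -option value pairs.
        if {![string match -?* $opt]} {
            error "expected an option, got \"$opt\""
        }
        dict set out [string range $opt 1 end] $val
    }
    return $out
}
# tk-args {-width 80 -height 24}  ->  width 80 height 24
```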
> The only downside is that if someone comes up with a better way of
> extending proc they can’t do it. Even if someone in the future
> manages to come up with something better than TIP#457 (which is, by
> the way, designed for extensions), it will have to swim upstream
> against the TIP#457 syntax in the parser that everyone is using
> already.
Indeed. That's the point behind having a standalone parser instead of something that is closely tied to [proc].
> Colin makes a valid argument that the functionality is useful beyond
> [proc], [method], and [apply] of λ-forms. In particular, invocations
> of coroutines would conceivably want to parse keyword parameters.
Unless I'm misunderstanding what you mean by 'invocations of
coroutines', the new extended specifiers can be used when calling
[coroutine] to create a coroutine, either with a proc name or a
lambda.
% proc allNumbers {{st -name start -default 0} {inc -name incr -default 1}} {
    yield
    set i $st
    while 1 {
        yield $i
        incr i $inc
    }
}
% coroutine oddValues allNumbers -start 1 -incr 2
> (I'm not quite following the argument about [interp alias], but that really
> doesn't matter.
Same for [interp alias]: the arguments can use keyword arguments if
the aliased command supports them.
% proc log {{level -switch {debug}} msg} { ... }
% alias {} debug {} log -debug --
-- Mathieu
> Unless I'm misunderstanding what you mean by 'invocations of
> coroutines', the new extended specifiers can be used when calling
> [coroutine] to create a coroutine, either with a proc name or a
> lambda.
Resuming a coroutine through [yieldto] does not permit the use of
basic features of proc arguments (default values, the special 'args',
...). [yieldto] only returns a list; it does not set any variables.
Why would the support of extended specifiers be a requirement in that case?
For now, a proc argument is at most a list of 2 elements: the
variable name and possibly a default value.
The solution given by Mathieu is to consider a list of more than 2
elements a "TIP#457-defined" argument.
You found another way: as args has no default value, it is at most a
list of 1 element, so the argspec could be added there too, as a
second element of a list beginning with args.
There is maybe another solution: [proc] has 3 arguments: its name,
its argument list, its body.
Let's say Tcl would accept defining a proc with only 2 arguments:
proc procname {
    body
}
Thanks Frederic and Florent for suggesting these alternate proposals.
I'm now seriously considering having a separate command, which can be
somewhat 'attached' to the related proc.
The proposal I currently have in mind is to add a fourth optional
argument to [proc] between 'procname' and 'arglist':
proc procName ?argParseCmd? argList body
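Side by side, the two forms would read like this ('optParser' and the
extended spec it interprets are hypothetical placeholders):

```tcl
# Classic three-argument form, unchanged:
proc greet {name} {
    puts "hello, $name"
}

# Proposed four-argument form; the interstitial word names the
# command that interprets the (now extended) argument list:
proc greet2 optParser {{name -name name -default world}} {
    puts "hello, $name"
}
```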
The current proposal is at a dead end; a consensus regarding the
alternate proposal must be found. Although I understand you prefer the
current one, this proposal is more extensible and the [proc]
modification will not be strictly bound to this argument-handling
protocol.
It's only one word to add to the proc definition (two if Brad's
suggestion is used); I'm not sure I understand why you are talking
about "far more complexity to programs using it".
-- Mathieu
Alexandre and Kevin have also stated that they will vote NO on the
current proposal.
>> a consensus regarding the alternate proposal must be found.
>
> The current alternate proposals are:
>
> 1. TIP 457
> 2. getargs/parseargs/eatargs/somethingargs as a separate command within the proc
> 3. TIP 457 plus an encapsulation of the same parsing as a standalone command
I'm not sure I understand that one. Are you talking about a separate
proc command (proc-ext?) which implements TIP 457? In that case, it
can't be applied to proc-like commands (apply/lambda, TclOO
methods/constructors, ...).
> 4. The previous proposal a modified proc with 2 arguments, with kind of an “implicit args”
> 5. This proposal with 4 arguments, with the extra interstitial argument
>
> My preference would of course be #1 or #3, but #2 is a far better alternative than either of #4 or #5.
>
>> I'm not sure to understand why you are talking about "far more complexity to programs using it".
>
> Perhaps I don’t quite understand it, but it seems to imply adding an extra interstitial argument that will rewrite the proc at compile time, perhaps inserting a getargs function at the start?
Nope, (C-implemented) argument parser will provide an API for [proc]
to directly initialize the arguments. This will be very similar to the
current implementation, just organized differently.
-- Mathieu
My preference is to have [2], plus possibly an implementation of TIP
#288, possibly extended to allow optional args at arbitrary positions
along the lines of my earlier message.
My chief rationale is that I don't know what I don't know. If
[eatargs] becomes a benighted backwater, the conventional [proc] will
not have been touched, and we still could come up with another attempt
at getting it right, without having used up a lot of syntax. Moreover,
[eatargs] would be available not only to [proc], but also to
[oo::define ... constructor], [oo::define ... method], [apply],
[incr Tcl] methods, coroutine invocations, results from [yieldto],
and probably other things that I can't think of at the moment.
It doesn't cut across multiple areas of the Core code.
Introspection is an issue that I don't think we can resolve
gracefully, unfortunately.
For compilation, [eatargs] and an extended [proc] are about the same
level of difficulty, so don't base any conclusions on how easy or hard
it will be to generate high-performance code with this sort of scheme.
And yes, [eatargs] is a silly name, intentionally. Alexandre and I
have stuck with the name because we hoped it would avoid derailing the
discussion into a lengthy and ultimately fruitless argument over what
to call the thing.
Given that it's a separate command, my preference would actually be to
see it tried in a bundled package first, because we don't have any
experience at all with how it performs under fire. I'd compromise, and
say that it could go in Core, primarily because I recognize that
getting decent performance with the thing most likely needs some help
from the bytecode engine, and that's pretty inaccessible without
tendrils that penetrate into a good many of the Core's private
interfaces and data structures.
On 13/06/2017 at 21:55, Peter da Silva wrote:
> The current alternate proposals are:
> 1. TIP 457
> 2. getargs/parseargs/eatargs/somethingargs as a separate command within the proc
> 3. TIP 457 plus an encapsulation of the same parsing as a standalone command
> 4. The previous proposal a modified proc with 2 arguments, with kind of an “implicit args”
> 5. This proposal with 4 arguments, with the extra interstitial argument
>
> My preference would of course be #1 or #3, but #2 is a far better alternative than either of #4 or #5.
Consider, for #4, an "args" command, as a namespace ensemble command.
1° Argument parsing for a proc.
The args command takes 3 arguments:
First is the proc name (parameter)
Second is a subcommand
Third is an argument for the subcommand.
It returns the proc name.
The proc command then receives 2 arguments.
a) as usual -> subcommand set
proc [args /*MyProc*/ set {
# usual argument parsing
a b c args
}] {
... body ...
}
b) via a script -> subcommand apply
proc [args /*MyProc*/ apply {
# script argument parsing
foreach e $args {
...
}
}] {
... body ...
}
c) via a command -> subcommand eval
proc [args MyProc eval MyParseCmd] {
... body ...
}
2° Argument parsing for a lambda.
The args command takes 3 arguments:
First is an empty string as parameter
Second is a subcommand
Third is an argument for the subcommand.
It returns an args with a default (Frédéric Bonnet's third way)
a) as usual
apply [list [args {} set {a b c args}] {
... lambda body ...
} /*namespace*/]
-> apply {{a b c args} {... lambda body ...} /*namespace*/}
b) with a script
apply [list [args {} apply {# script to parse}] {
... lambda body ...
} /*namespace*/]
-> apply { {{args {apply {#script}}}} {... lambda body ...} /*namespace*/}
c) with a command
apply [list [args {} eval MyParseCmd] {
... lambda body ...
} /*namespace*/]
-> apply { {{args {eval MyParseCmd}}} {... lambda body ...} /*namespace*/}
3° As a standalone command.
The args command then takes 4 arguments or more:
args {} subcommand /*SubcommandArg*/ /*valueToBeParsed*/
/*valueToBeParsed*/ ...
Then it assigns arguments, in the caller's context, following the
subcommand specification.
4° Finally, for a coroutine:
proc coro {} {
set args {}
while 1 {
args {} set {a b c args} [yieldto lindex $result]
args {} apply {# script to parse} [yieldto lindex $result]
args {} eval MyParseCmd [yieldto lindex $result]