Jira (PUP-10513) Do not recursively type check collections when creating iterables in puppet functions

2 views
Skip to first unread message

Justin Stoller (Jira)

unread,
May 15, 2020, 6:21:02 PM5/15/20
to puppe...@googlegroups.com
Justin Stoller updated an issue
 
Puppet / Task PUP-10513
Do not recursively type check collections when creating iterables in puppet functions
Change By: Justin Stoller
(the method invoked in Puppet functions to create the Iterator)
Add Comment Add Comment
 
This message was sent by Atlassian Jira (v8.5.2#805002-sha1:a66f935)
Atlassian logo

Justin Stoller (Jira)

unread,
May 15, 2020, 6:21:04 PM5/15/20
to puppe...@googlegroups.com
Justin Stoller created an issue
Issue Type: Task Task
Assignee: Justin Stoller
Created: 2020/05/15 3:20 PM
Environment:

When a user calls the Puppet functions `map`, `each`, etc. we create a Puppet Iterator and proxy requests through it to the wrapped Ruby object. When creating a Puppet Iterator in this way we recursively infer the type of the entire collection passed in.

Some times these Iterators are directly exposed to users (via the non-block arities of `reverse_each` and `step`). Unfortunately, where we expose Iterators they may use all of Puppet's type system and may be specialized like "Iterator[String]" or "Iterator[Variant[Hash[String,String],Undef]]". Consequently, to be backwards compatible we will need to recursively check the type for these invocations. (Maybe we can deprecate and remove them in Puppet 7??)

In every other case the Iterator is not exposed to the user. Either the original collection, a new, modified collection, or elements of one of the afore mentioned collections is returned.

The method invoked in Puppet functions to create the Iterators is `Iterable.asserted_iterable`, and as far as I can tell, it does additional type checking that the dispatcher doesn't do and then largely discards the computed type information. We should should probably keep the call to asserted_iterable (assuming the additional type checking is a good thing to do there), but make it non-recursive where possible.

Note, `asserted_iterable` defers to `Iterable.on` to do a lot of this work, `Iterable.on` is used by other aspects of the type system so we might want to keep existing calls to `Iterable.on` backwards compatible with those invocations. (We might also want to look into whether or not those calls are unnecessarily doing recursive type inference.)

Priority: Normal Normal
Reporter: Justin Stoller

Justin Stoller (Jira)

unread,
May 15, 2020, 6:22:03 PM5/15/20
to puppe...@googlegroups.com
Justin Stoller updated an issue
Change By: Justin Stoller
When a user calls the Puppet functions `map`, `each`, etc. we create a Puppet Iterator and proxy requests through it to the wrapped Ruby object. When creating a Puppet Iterator in this way we recursively infer the type of the entire collection passed in.

Some times these Iterators are directly exposed to users ( via the non-block arities of `reverse_each` and `step`). Unfortunately, where we expose Iterators they may use all of Puppet's type system and may be specialized like "Iterator[String]" or "Iterator[Variant[Hash[String,String],Undef]]". Consequently, to be backwards compatible we will need to recursively check the type for these invocations. (Maybe we can deprecate and remove them in Puppet 7??)


In every other case the Iterator is not exposed to the user. Either the original collection, a new, modified collection, or elements of one of the afore mentioned collections is returned.

The method invoked in Puppet functions to create the Iterator Iterators is `Iterable.asserted_iterable`, and as far as I can tell, it does additional type checking that the dispatcher doesn't do and then largely discards the computed type information. We should should probably keep the call to asserted_iterable (assuming the additional type checking is a good thing to do there ) , but make it non-recursive where possible.


Note, `asserted_iterable` defers to `Iterable.on` to do a lot of this work, `Iterable.on` is used by other aspects of the type system so we might want to keep existing calls to `Iterable.on` backwards compatible with those invocations. (We might also want to look into whether or not those calls are unnecessarily doing recursive type inference.)

Justin Stoller (Jira)

unread,
May 15, 2020, 6:22:03 PM5/15/20
to puppe...@googlegroups.com
Justin Stoller updated an issue
Change By: Justin Stoller
Environment:
When a user calls the Puppet functions `map`, `each`, etc. we create a Puppet Iterator and proxy requests through it to the wrapped Ruby object. When creating a Puppet Iterator in this way we recursively infer the type of the entire collection passed in.

Some times these Iterators are directly exposed to users (via the non-block arities of `reverse_each` and `step`). Unfortunately, where we expose Iterators they may use all of Puppet's type system and may be specialized like "Iterator[String]" or "Iterator[Variant[Hash[String,String],Undef]]". Consequently, to be backwards compatible we will need to recursively check the type for these invocations. (Maybe we can deprecate and remove them in Puppet 7??)


In every other case the Iterator is not exposed to the user. Either the original collection, a new, modified collection, or elements of one of the afore mentioned collections is returned.

The method invoked in Puppet functions to create the Iterators is `Iterable.asserted_iterable`, and as far as I can tell, it does additional type checking that the dispatcher doesn't do and then largely discards the computed type information. We should should probably keep the call to asserted_iterable (assuming the additional type checking is a good thing to do there), but make it non-recursive where possible.


Note, `asserted_iterable` defers to `Iterable.on` to do a lot of this work, `Iterable.on` is used by other aspects of the type system so we might want to keep existing calls to `Iterable.on` backwards compatible with those invocations. (We might also want to look into whether or not those calls are unnecessarily doing recursive type inference.)

Henrik Lindberg (Jira)

unread,
May 17, 2020, 6:05:02 AM5/17/20
to puppe...@googlegroups.com
Henrik Lindberg commented on Task PUP-10513
 
Re: Do not recursively type check collections when creating iterables in puppet functions

Maybe it is possible to pass along any constraints to the iterator creator and then attempt to short circuit certain subsequent type matching; i.e. when someone writes:

$myhash.each |$k String, $v Hash| { something() }

It would probably be more lightweight to check if the produced iterator satisfies the constraints that the lambda imposes (must be something producing String/Hash) and then skip the checking on each invocation of the lambda. Likewise, if the lambda given to each imposes no constraints the inference can be skipped.

For the remaining cases, it is perhaps possible to make the inference lazy, waiting until type information is actually needed. In some cases it will never be needed; for example:

$data.reverse_each.map |$x| { "$x" }

There are other opportunities to do type inference optimizations - consider this for example:

function foo($data Array[String]) >> Array[String] {
    return *$data.reverse_each
}

In the function's body it is known that `$data` must be an Array of String. Thus the reverse_each could be made to know that is the case, and the type of the splat (since it is an Array (and not a Tuple with possible different type per slot)) will also produce Array[String], and then it is known that the return is Array[String]. Thus, it is possible to avoid 2 type inferences.

The optimizations would take time to compute so it is a tradeoff against the possible gain.

Josh Cooper (Jira)

unread,
Nov 11, 2021, 2:55:02 PM11/11/21
to puppe...@googlegroups.com
Josh Cooper commented on Task PUP-10513

The case where the block form is called and the block arguments are not specified or are generic (String $k) was fixed in PUP-9561, e.g.

$myhash.each |$k, $v| { ... }

So instead of inferring the type of the entire Hash and then calling Type.callable?(args), we call Type.callable_with?(args), which Thomas added in PUP-7301. In the common case where the type isn't specified $k, then the check is very quick.

This doesn't fix/handle expensive type inference in asserted_iterable but maybe a similar approach could be taken there?

This message was sent by Atlassian Jira (v8.13.2#813002-sha1:c495a97)
Atlassian logo

Josh Cooper (Jira)

unread,
Mar 8, 2022, 2:03:02 PM3/8/22
to puppe...@googlegroups.com
Josh Cooper updated an issue
 
Change By: Josh Cooper
Team: Froyo
This message was sent by Atlassian Jira (v8.20.2#820002-sha1:829506d)
Atlassian logo

Josh Cooper (Jira)

unread,
Mar 8, 2022, 4:58:01 PM3/8/22
to puppe...@googlegroups.com
Josh Cooper commented on Task PUP-10513
 
Re: Do not recursively type check collections when creating iterables in puppet functions

Hey justin where did you leave off with this? Did you have the start of a PR?

Justin Stoller (Jira)

unread,
Mar 8, 2022, 5:31:01 PM3/8/22
to puppe...@googlegroups.com

Justin Stoller (Jira)

unread,
Mar 8, 2022, 5:32:02 PM3/8/22
to puppe...@googlegroups.com
Justin Stoller updated an issue
 
Change By: Justin Stoller
Fix Version/s: PUP 5.5.22
Fix Version/s: PUP 6.18.0
Reply all
Reply to author
Forward
0 new messages