IDictionary.Contains

Patrick Earl

unread,

Feb 19, 2011, 1:32:40 AM2/19/11

to re-motion Users

NHibernate handles IDictionary.Contains differently than something
like IEnumerable.Contains. It needs to differentiate between
searching the key column and searching the value column. The latest
re-linq changes group IDictionary.Contains and IEnumerable.Contains
into the ContainsResultOperator. Does anyone have suggestions on a
convenient way to avoid this newly created conflict? I have pondered
trying to make re-linq skip IDictionary and trying to figure out if
something is a dictionary in the result operator processor. I'd love
to hear anyone's thoughts on doing this in a clean way. Should
IDictionary possibly be an exception to the normal behavior in the re-
linq code?

Thanks for your ideas.

Patrick Earl

Michael Ketting

unread,

Feb 19, 2011, 3:09:16 AM2/19/11

to re-moti...@googlegroups.com

Hi Patrick!

I'll have to leave it to Fabian to go into the gritty details, but I do remember when we had this discussion. Sort of. Here's the blog post Fabian wrote about the subject: https://www.re-motion.org/blogs/mix/archive/2010/09/23.aspx (At least, I'm 95% sure, that's the same issue)

Anyway, I believe the answer lies in the workarounds posted at the end of the post. Possibly here:

You can allow your users to register parsers for their own query operators (MyList<T>.Contains()) by exposing the MethodCallExpressionNodeTypeRegistry.

Michael

Wenig, Stefan

unread,

Feb 19, 2011, 6:47:33 AM2/19/11

to re-moti...@googlegroups.com

Here's my 2 cents:

The original idea was to kind of normalize all kinds of Contains() calls to the same result operator, so that a provider would not have to account for all kinds of different calls and methods (static extension, interface, class instance methods, different collection types). The underlying assumption was that eventually it would all result in the same code anyway, because whatever the subtleties in C# are, there's not a lot of different semantics that "contains" could have in any given backend (like SQL).

So the question is, IMHO: is that assumption true, but there are different strategies depending on the collection type? then maybe just looking at that type in the result operator processor, as you suggested, is the way to go. This might lead to trouble in a scenario with a collection type that implements both ICollection and IEnumerable, and you want to make a distinction between Enumerable.Contains(collection,item) and collection.Contains (key).
Or do you actually need to make that method-definition-based distinction? Then maybe there should be a ContainsKeyResultOperator registered for ICollection.Contains. I'll leave it to Fabian to comment on wheter this should go into re-linq or NH then.

HTH,
Stefan

From: re-moti...@googlegroups.com [re-moti...@googlegroups.com] on behalf of Michael Ketting [michael...@rubicon.eu]
Sent: Saturday, February 19, 2011 09:09
To: re-moti...@googlegroups.com
Subject: [re-motion-users] Re: IDictionary.Contains

--
You received this message because you are subscribed to the Google Groups "re-motion Users" group.
To post to this group, send email to re-moti...@googlegroups.com.
To unsubscribe from this group, send email to re-motion-use...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/re-motion-users?hl=en.

Patrick Earl

unread,

Feb 19, 2011, 10:55:28 AM2/19/11

to re-motion Users

I thought it would be okay to figure out the type in the
ContainsResultOperator, but after looking for a while, I couldn't even
figure out how to do this without following some very complicated
route. I'd be happy to have some practical advice in that area if
that's the best way to do it.

Patrick Earl

On Feb 19, 4:47 am, "Wenig, Stefan" <stefan.we...@rubicon.eu> wrote:
> Here's my 2 cents:
>
> The original idea was to kind of normalize all kinds of Contains() calls to the same result operator, so that a provider would not have to account for all kinds of different calls and methods (static extension, interface, class instance methods, different collection types). The underlying assumption was that eventually it would all result in the same code anyway, because whatever the subtleties in C# are, there's not a lot of different semantics that "contains" could have in any given backend (like SQL).
>
> So the question is, IMHO: is that assumption true, but there are different strategies depending on the collection type? then maybe just looking at that type in the result operator processor, as you suggested, is the way to go. This might lead to trouble in a scenario with a collection type that implements both ICollection and IEnumerable, and you want to make a distinction between Enumerable.Contains(collection,item) and collection.Contains (key).
> Or do you actually need to make that method-definition-based distinction? Then maybe there should be a ContainsKeyResultOperator registered for ICollection.Contains. I'll leave it to Fabian to comment on wheter this should go into re-linq or NH then.
>
> HTH,
> Stefan
> ________________________________

> From: re-moti...@googlegroups.com [re-moti...@googlegroups.com] on behalf of Michael Ketting [michael.kett...@rubicon.eu]

> Sent: Saturday, February 19, 2011 09:09
> To: re-moti...@googlegroups.com
> Subject: [re-motion-users] Re: IDictionary.Contains
>
> Hi Patrick!
>

> I'll have to leave it to Fabian to go into the gritty details, but I do remember when we had this discussion. Sort of. Here's the blog post Fabian wrote about the subject:https://www.re-motion.org/blogs/mix/archive/2010/09/23.aspx(At least, I'm 95% sure, that's the same issue)

Patrick Earl

unread,

Feb 20, 2011, 12:27:50 AM2/20/11

to re-motion Users

I got impatient and solved it by creating an INodeTypeProvider that
returns false when IsRegistered is called for IDictionary.Contains.
NHibernate is now using .93. If there are any better suggestions, I'd
be happy to hear them.

I'm fairly happy with the new version since it simplifies and fixes a
few things. Thanks for the work on that!

The code for implementing the Aggregate operator is a bit more
complicated now. I have a couple questions.

1. Are you aware of any providers that actually implement the
Aggregate operator at the database level?

2. Is there a simpler way to perform this result operator processing?

public void Process(AggregateResultOperator resultOperator,
QueryModelVisitor queryModelVisitor, IntermediateHqlTree tree)
{
var inputExpr =
((StreamedSequenceInfo)queryModelVisitor.PreviousEvaluationType).ItemExpression;
var inputType = inputExpr.Type;
var paramExpr = Expression.Parameter(inputType, "item");
var accumulatorFunc = Expression.Lambda(
ReplacingExpressionTreeVisitor.Replace(inputExpr, paramExpr,
resultOperator.Func.Body),
resultOperator.Func.Parameters[0],
paramExpr);

var inputList =
Expression.Parameter(typeof(IEnumerable<>).MakeGenericType(typeof(object)),
"inputList");

var castToItem = EnumerableHelper.GetMethod("Cast", new[]
{ typeof(IEnumerable) }, new[] { inputType });
var castToItemExpr = Expression.Call(castToItem, inputList);

var aggregate = ReflectionHelper.GetMethodDefinition(() =>
Enumerable.Aggregate<object>(null, null));
aggregate =
aggregate.GetGenericMethodDefinition().MakeGenericMethod(inputType);

MethodCallExpression call = Expression.Call(
aggregate,
castToItemExpr,
accumulatorFunc
);

tree.AddListTransformer(Expression.Lambda(call, inputList));
}

Appreciate any thoughts.

Patrick Earl

On Feb 19, 8:55 am, Patrick Earl <hyn...@gmail.com> wrote:
> I thought it would be okay to figure out the type in the
> ContainsResultOperator, but after looking for a while, I couldn't even
> figure out how to do this without following some very complicated
> route. I'd be happy to have some practical advice in that area if
> that's the best way to do it.
>
> Patrick Earl
>
> On Feb 19, 4:47 am, "Wenig, Stefan" <stefan.we...@rubicon.eu> wrote:
>
>
>
>
>
>
>
> > Here's my 2 cents:
>
> > The original idea was to kind of normalize all kinds of Contains() calls to the same result operator, so that a provider would not have to account for all kinds of different calls and methods (static extension, interface, class instance methods, different collection types). The underlying assumption was that eventually it would all result in the same code anyway, because whatever the subtleties in C# are, there's not a lot of different semantics that "contains" could have in any given backend (like SQL).
>
> > So the question is, IMHO: is that assumption true, but there are different strategies depending on the collection type? then maybe just looking at that type in the result operator processor, as you suggested, is the way to go. This might lead to trouble in a scenario with a collection type that implements both ICollection and IEnumerable, and you want to make a distinction between Enumerable.Contains(collection,item) and collection.Contains (key).
> > Or do you actually need to make that method-definition-based distinction? Then maybe there should be a ContainsKeyResultOperator registered for ICollection.Contains. I'll leave it to Fabian to comment on wheter this should go into re-linq or NH then.
>
> > HTH,
> > Stefan
> > ________________________________
> > From: re-moti...@googlegroups.com [re-moti...@googlegroups.com] on behalf of Michael Ketting [michael.kett...@rubicon.eu]
> > Sent: Saturday, February 19, 2011 09:09
> > To: re-moti...@googlegroups.com
> > Subject: [re-motion-users] Re: IDictionary.Contains
>
> > Hi Patrick!
>

> > I'll have to leave it to Fabian to go into the gritty details, but I do remember when we had this discussion. Sort of. Here's the blog post Fabian wrote about the subject:https://www.re-motion.org/blogs/mix/archive/2010/09/23.aspx(Atleast, I'm 95% sure, that's the same issue)

Fabian Schmied

unread,

Feb 21, 2011, 10:03:24 AM2/21/11

to re-moti...@googlegroups.com

Hi Patrick,

I think you were right in your initial post. As Stefan said, the idea
was to be able to translate all calls to Contains methods on
IEnumerables to a single result operator, since those usually have the
same semantics. String.Contains was already exempt (as this has a
different semantics), and IDictionary.Contains (which is really a
ContainsKey) should be exempt as well. I've created an issue here:
"https://www.re-motion.org/jira/browse/RM-3753".

The problem is, of course, inherent with our current detection of
"Contains-like" result operators. (Which was, btw, requested by
NHibernate.) The ContainsExpressionNode parser handles every method
called "Contains" that has only one argument (in addition to the
source sequence) and is declared on an IEnumerable.
IDictionary.Contains is one example of how this "fuzzy matching" can
go wrong.

> I got impatient and solved it by creating an INodeTypeProvider that
> returns false when IsRegistered is called for IDictionary.Contains.
> NHibernate is now using .93. If there are any better suggestions, I'd
> be happy to hear them.

This is similar to what I'd have suggested in this case.

If I implemented this in re-linq, I would add a
ContainsKeyResultOperator and a ContainsKeyExpressionNode parser (that
creates the ContainsKeyResultOperator). This has the advantage that it
also works at the top level of a query (whereas leaving it a
MethodCallExpression only works if the call to ContainsKey is embedded
within a query).

E.g., Orders.Items.Contains (12); // where Orders.Items is IDictionary

Without an expression node parser for the Contains method, this query
will fail to work with re-linq.
However, this might not be an issue for you if you only need those
Contains methods to work in where conditions and similar places.

(BTW, I wonder how you're detecting calls to IDictionary.Contains -
it's not easy to infer whether a MethodInfo is an implementation of a
specific interface method. So, it's difficult to get reflection code
to handle all implementations of IDictionary.Contains the same way.)

> The code for implementing the Aggregate operator is a bit more
> complicated now. I have a couple questions.
>
> 1. Are you aware of any providers that actually implement the
> Aggregate operator at the database level?

No. Not even Linq to SQL does that. (Haven't checked Linq to Entities, though.)

> 2. Is there a simpler way to perform this result operator processing?

Let me summarize what your code does:

1 - You're retransforming the accumulating Lambda back to the Func the
user specified.
2 - You build an Aggregate method call expression that mimics what the
user typed originally, but using Enumerable.Aggregate (not
Queryable.Aggregate).
3 - You add that method call expression as a "list transformer" to the HQL tree.

In other words, the first two steps effectively undo the
simplification the re-linq parser made.

I'm assuming for now that a list transformer is a LambdaExpression
that is executed on the result of an HQL query in-memory. (If not, I
wonder if undoing re-linq's simplifications really helps. After all,
the expression would have to be analyzed again in another place.)

If I'm right about the in-memory part, this is very similar to what
AggregateResultOperator.ExecuteInMemory does. Check out the code here:
"https://svn.re-motion.org/svn/Remotion/trunk/Remotion/Data/Linq/Clauses/ResultOperators/AggregateResultOperator.cs".
As you can see, the code uses the
ReverseResolvingExpressionTreeVisitor instead of the
ReplacingExpressionTreeVisitor; this simplifies step 1 a bit.

If you only execute Aggregate in-memory, there'd be another option:
replace the AggregateResultOperator with your own version that keeps
the original aggregating func.
This means:
- Add an AggregateInMemoryResultOperator class.
- Add an AggregateInMemoryExpressionNode parser class that simply
creates an instance of AggregateInMemoryResultOperator.
- Where you create your customized QueryParser, register your custom
node type (see "https://www.re-motion.org/jira/browse/RM-3721") for
the Enumerable/Queryable.Aggregate methods.

Then, you could avoid step 1 entirely.

About step 2 - building the Aggregate method call - I fear, you can't
really get rid of that. You can't reuse the expression from the
original LINQ query (it has the wrong source sequence, might use
Queryable.Aggregate, etc.). You could provide a few helper methods to
aid with generation of expressions, but I can't think of much more
simplification. (Maybe you can get rid of the Cast by calling a
different closed version of the generic Aggregate method. I.e.,
Aggregate<T> - where T is the input item type - rather than
Aggregate<object>. But that depends on how the list transformer works;
i.e., if it gives an IEnumerable<T> or only an IEnumerable.)

Regards,
Fabian

Fabian Schmied

unread,

Oct 11, 2012, 4:33:42 AM10/11/12

to re-moti...@googlegroups.com

Hi,

> I think you were right in your initial post. As Stefan said, the idea
> was to be able to translate all calls to Contains methods on
> IEnumerables to a single result operator, since those usually have the
> same semantics. String.Contains was already exempt (as this has a
> different semantics), and IDictionary.Contains (which is really a
> ContainsKey) should be exempt as well. I've created an issue here:
> "https://www.re-motion.org/jira/browse/RM-3753".

For everyone's information: I've now fixed this issue by excluding
types implementing IDictionary from the heuristics deciding whether a
method called "Contains" corresponds to a ContainsResultOperator.

As a result, when somebody uses IDictionary.Contains,
Hashtable.Contains, etc. within a LINQ query, re-linq will not
interpret that method as a query operator, but leave the
MethodCallExpression in the QueryModel, starting from re-linq
1.13.170. (Enumerable/Queryable.Contains still get interpreted as
query operators, even on types implementing IDictionary.)

Best regards,
Fabian

Reply all

Reply to author

Forward