QueryModel Transform Linq?


yjinglee

Oct 15, 2012, 5:55:18 AM
to re-moti...@googlegroups.com
I am working on querying a sharded database.
I need to run LINQ to Objects operations in memory on the database query results, so I want to get the grouping expressions or result operators from the QueryModel object.

For example, given this query:
            var q =
                from p in DB.Products
                group p by p.CategoryID into g
                select new
                {
                    g.Key,
                    MaxPrice = g.Max(p => p.UnitPrice)
                };

To search multiple databases, I modified the ExecuteCollection method:

        public IEnumerable<T> ExecuteCollection<T>(QueryModel queryModel)
        {
            ArgumentUtility.CheckNotNull("queryModel", queryModel);
            var commandData = GenerateSqlCommand(queryModel);
            var projection = commandData.GetInMemoryProjection<T>().Compile();
            var results = new List<T>();
            foreach (var connectionString in _connectionStrings)
            {
                results.AddRange(_resultRetriever.GetResults(projection, connectionString, commandData.CommandText, commandData.Parameters));
            }
            return results.AsQueryable()......;
        }

Can I get the grouping expressions or result operators from the QueryModel object? For example, I want to take this part of the query:

                group p by p.CategoryID into g
                select new
                {
                    g.Key,
                    MaxPrice = g.Max(p => p.UnitPrice)
                }

and run it as LINQ to Objects in memory after the results are returned.

Fabian Schmied

Oct 15, 2012, 7:40:06 AM
to re-moti...@googlegroups.com
Hi,

> I think shards db query.
> I need db query results do linq2object operation in memory , so I want to
> get the need to group expressions or result operator from QueryModel object.
>
> for example,
> query:
> var q =
> from p in DB.Products
> group p by p.CategoryID into g
> select new
> {
> g.Key,
> MaxPrice = g.Max(p => p.UnitPrice)
> };
>

[...]

> can I get this group expressions or result operator from QueryModel object,

[...]

re-linq represents the sample query you've given as follows:

from g in
(
(from p in DB.Products
select [p])
.GroupBy ([p].CategoryID, [p])
)
select new
{
[g].Key,
MaxPrice = (from p in [g] select [p].UnitPrice).Max()
};

(This normalized QueryModel is equivalent to your original query.)

If I understand you correctly, you want to execute the "from p in
DB.Products select [p]" part within a set of databases, but everything
else in memory, right?
Unfortunately, re-linq does not provide this out of the box; you'll
have to do a lot of this yourself. I'd suggest the following approach:

- When your ExecuteCollection method is invoked, you first need to use
a combination of QueryModelVisitors and ExpressionTreeVisitors to find
those parts of the QueryModel that you can execute - "from p in
DB.Products select [p]", for example.
- Execute those parts against your data source(s), then replace the
parts you've just executed with the results. In your example, this
leaves the following QueryModel:

from g in
(
(from p in <COLLECTION>
select [p])
.GroupBy ([p].CategoryID, [p])
)
select new
{
[g].Key,
MaxPrice = (from p in [g] select [p].UnitPrice).Max()
};

- Then you can either:
-- interpret the remaining QueryModel; i.e., using a visitor, go
through the QueryModel, and filter/order/project the data according to
the clauses you find. For result operators, you can use the
ExecuteInMemory method, but clauses you'll have to interpret yourself.
Or,
-- translate the remaining QueryModel into a single LambdaExpression
representing a LINQ to Objects query, compile it, and run it.
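
To make the goal of the in-memory part concrete, here is a minimal sketch of what it amounts to for the sample query once the rows from all databases have been fetched. It is hand-written LINQ to Objects, not something generated from a QueryModel. The Product class, the ShardMerge helper and its rowsPerShard parameter are made up for illustration and are not part of re-linq.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity with just the two members the sample query uses.
public sealed class Product
{
    public int CategoryID { get; set; }
    public decimal UnitPrice { get; set; }
}

public static class ShardMerge
{
    // Concatenates the rows fetched from every database, then performs the
    // grouping and the Max aggregation in memory.
    public static Dictionary<int, decimal> MaxPricePerCategory(
        IEnumerable<IEnumerable<Product>> rowsPerShard)
    {
        IEnumerable<Product> merged = rowsPerShard.SelectMany(rows => rows);

        return merged
            .GroupBy(p => p.CategoryID)
            .ToDictionary(g => g.Key, g => g.Max(p => p.UnitPrice));
    }
}
```

The point of the approach above is to arrive at an equivalent of this GroupBy/Max chain automatically from the remaining QueryModel instead of writing it by hand.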

As I said, re-linq has no support for this out of the box, so you'll
have to build it yourself. However, I'd be interested in your
progress, so please keep us up to date on this list. And of course,
you can ask here if you have specific problems with re-linq.

BTW, Alexandr Nikitin recently described a similar problem, see here:
"https://groups.google.com/d/topic/re-motion-users/A4rumhpYdF8/discussion".
If he has already solved his problem, you might be able to use a
similar approach as he has.

Best regards,
Fabian

yjinglee

Oct 15, 2012, 11:49:19 PM
to re-moti...@googlegroups.com
Thanks Fabian 

On Monday, October 15, 2012 at 7:40:27 PM UTC+8, Fabian Schmied wrote:

yjinglee

Oct 16, 2012, 6:06:54 AM
to re-moti...@googlegroups.com
Hi Fabian

I mean that I execute the following against multiple databases:
 var q = 
                 from p in DB.Products 
                 group p by p.CategoryID into g 
                 select new 
                 { 
                     g.Key, 
                     MaxPrice = g.Max(p => p.UnitPrice) 
                 }; 
then merge the result collections, and then execute the same query again in memory:

 var q = 
                 from p in results

                 group p by p.CategoryID into g 
                 select new 
                 { 
                     g.Key, 
                     MaxPrice = g.Max(p => p.UnitPrice) 
                 }; 

How do I translate the remaining QueryModel into a single LambdaExpression using the QueryModel.TransformExpressions method?

        private IEnumerable<T> ExecuteInMemory<T>(IEnumerable<T> entities, QueryModel queryModel)
        {
            // TODO: apply the LINQ to Objects query to the entities
            queryModel.TransformExpressions(exp=>exp);

            return entities;
        }
Can you write a pseudo-code example that generates the GroupBy LambdaExpression for in-memory execution?

Alexander I. Zaytsev

Oct 16, 2012, 6:23:31 AM
to re-moti...@googlegroups.com
I think ReverseResolvingExpressionTreeVisitor does what you need. But I might be wrong.


Fabian Schmied

Oct 16, 2012, 9:58:09 AM
to re-moti...@googlegroups.com
Hi,

[...]

> then in memory execute too
>
> var q =
> from p in results
>
> group p by p.CategoryID into g
> select new
> {
> g.Key,
> MaxPrice = g.Max(p => p.UnitPrice)
> };
>
> how translate the remaining QueryModel into a single LambdaExpression using
> QueryModel .TransformExpressions method?

You cannot make a single LambdaExpression using
QueryModel.TransformExpressions. You need to build it yourself, using
a complex algorithm that constructs a big Expression bit by bit by
calling factory methods such as Expression.Call. Your algorithm needs
to detect nested QueryModels and translate them into Expressions
recursively. As for the expressions embedded within the
QueryModel: as Alexander said, you can use the
ReverseResolvingExpressionTreeVisitor to get LambdaExpressions for
those.

This is a really complex problem, and I can't give you a full
algorithm, but I'll explain the steps for your specific example.

In your sample, there are three QueryModels:
a - there is a nested QueryModel: (from p in results select
[p]).GroupBy (KeySelector = [p].CategoryID, ElementSelector = [p])
b - there is a second nested QueryModel: (from p in [g] select
[p].UnitPrice).Max()
c - there is an outer QueryModel: (from g in (QueryModel a) select new
{ [g].Key, MaxPrice = (QueryModel b) })

You need to deal with each of them separately, going from the
MainFromClause over the BodyClauses and the SelectClause to the result
operators. Each QueryModel will be translated into an Expression; the
Expression for QueryModel c will contain those of QueryModel a and b.
While translating QueryModels into Expressions, you should use a
StreamedSequenceInfo object to keep track of the data type of the
items at the current position within the QueryModel.

Let's start with QueryModel c, as it is the outer-most QueryModel:
queryModelC == (from g in (QueryModel a) select new { [g].Key,
MaxPrice = (QueryModel b) }).

- The queryModelC.MainFromClause.FromExpression contains a
SubQueryExpression, which is holding QueryModel a.

So let's deal with the nested QueryModel a first and come back to
QueryModel c later: queryModelA == (from p in results select
[p]).GroupBy (KeySelector = [p].CategoryID, ElementSelector = [p]).

- queryModelA.MainFromClause.FromExpression holds a ConstantExpression
which represents the results you got from the database. Take this
expression and put it into an Expression variable, let's call it
queryExpr. Also construct a sequence info object to keep track of the
data types and expressions: "new StreamedSequenceInfo (typeof
(IEnumerable<>).MakeGenericType (mainFromClause.ItemType), new
QuerySourceReferenceExpression (mainFromClause))"; let's call it sequenceInfo.
- queryModelA.BodyClauses is empty, so queryExpr and sequenceInfo stay the same.
- The queryModelA.SelectClause is trivial, so queryExpr and
sequenceInfo stay the same.
- Now comes the GroupResultOperator. Use the
ReverseResolvingExpressionTreeVisitor to get a LambdaExpression for
the KeySelector and the ElementSelector. Pass in the
sequenceInfo.ItemExpression, then the LambdaExpressions should be
constructed correctly.
- Now construct a call to the Enumerable.GroupBy<TSource, TKey,
TElement> method. GroupBy takes the incoming query as its first
argument, then the key selector lambda, then the element selector
lambda: Expression.Call (groupByMethod.MakeGenericMethod
(sequenceInfo.ResultItemType, groupResultOperator.KeySelector.Type,
groupResultOperator.ElementSelector.Type), queryExpr, keySelectorLambda,
elementSelectorLambda). Store the result in queryExpr.
- Update the sequenceInfo: sequenceInfo = (StreamedSequenceInfo)
groupResultOperator.GetOutputDataInfo (sequenceInfo).

- You now have an Expression (queryExpr) that represents the inner
QueryModel a, and a StreamedSequenceInfo (sequenceInfo) that describes
the items.
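
The GroupBy step for QueryModel a can be sketched with plain System.Linq.Expressions as follows. This is only an illustration under assumptions: the Product class and the GroupByBuilder helper are made up, and the two selector lambdas are written by hand where a real implementation would obtain them from the ReverseResolvingExpressionTreeVisitor. No re-linq types are used, so the StreamedSequenceInfo bookkeeping is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical entity standing in for the rows returned by the databases.
public sealed class Product
{
    public int CategoryID { get; set; }
    public decimal UnitPrice { get; set; }
}

public static class GroupByBuilder
{
    // Enumerable.GroupBy<TSource, TKey, TElement>(source, keySelector, elementSelector).
    // The third parameter must be Func<,> to exclude the resultSelector overload.
    private static readonly MethodInfo GroupByMethod = typeof(Enumerable).GetMethods()
        .Single(m => m.Name == "GroupBy"
                     && m.GetGenericArguments().Length == 3
                     && m.GetParameters().Length == 3
                     && m.GetParameters()[2].ParameterType.GetGenericTypeDefinition() == typeof(Func<,>));

    public static Expression BuildGroupBy(IEnumerable<Product> results)
    {
        // queryExpr starts as the constant holding the merged database results.
        Expression queryExpr = Expression.Constant(results, typeof(IEnumerable<Product>));

        // Hand-written stand-ins for the reverse-resolved selectors.
        Expression<Func<Product, int>> keySelector = p => p.CategoryID;
        Expression<Func<Product, Product>> elementSelector = p => p;

        MethodInfo closed = GroupByMethod.MakeGenericMethod(
            typeof(Product), keySelector.ReturnType, elementSelector.ReturnType);

        // The incoming query goes first, then the two selector lambdas.
        return Expression.Call(closed, queryExpr, keySelector, elementSelector);
    }
}
```

The returned expression has the type IEnumerable<IGrouping<int, Product>>, which is what the MainFromClause of the outer QueryModel then iterates over.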

Let's go back to QueryModel c: queryModelC == (from g in <expression
built from QueryModel A> select new { [g].Key, MaxPrice = (QueryModel
b) }).

- We were at the MainFromClause of QueryModel c. Once again, we
initialize queryExpr and sequenceInfo variables. The queryExpr is the
Expression that we got from translating QueryModel a. The sequenceInfo
is again: "new StreamedSequenceInfo (typeof
(IEnumerable<>).MakeGenericType (mainFromClause.ItemType), new
QuerySourceReferenceExpression (mainFromClause))".
- queryModelC.BodyClauses is empty, so queryExpr and sequenceInfo stay the same.
- queryModelC.SelectClause has a complex Selector expression. First,
we need to handle the embedded QueryModel b, then we need to transform
the Selector into a LambdaExpression.

Okay, so now we can handle QueryModel b: queryModelB == (from p in [g]
select [p].UnitPrice).Max().

- Once again, construct a (new) queryExpr and a sequenceInfo, like above.
- This time, the SelectClause is not trivial, so we need to update
queryExpr and sequenceInfo.
- Build a LambdaExpression from the SelectClause.Selector by using the
ReverseResolvingExpressionTreeVisitor again. Use
sequenceInfo.ItemExpression as the input, and you'll get back a
LambdaExpression.
- Construct a call to Enumerable.Select<TSource, TResult> using
Expression.Call (...), like above. It takes the incoming query, then
the selector lambda. Use the LambdaExpression just constructed. Store
the result within queryExpr.
- Update the sequenceInfo by calling SelectClause.GetOutputDataInfo().
- Now comes the MaxResultOperator. Once again, use Expression.Call to
call the Enumerable.Max<...> method. Pass in the queryExpr.
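
The Select and Max steps for QueryModel b might look roughly like this when written with plain System.Linq.Expressions. Again a sketch under assumptions: Product and MaxPriceBuilder are invented names, the [g] reference is modelled as a ParameterExpression, and the selector lambda is written by hand instead of coming from the ReverseResolvingExpressionTreeVisitor.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical entity standing in for the rows returned by the databases.
public sealed class Product
{
    public int CategoryID { get; set; }
    public decimal UnitPrice { get; set; }
}

public static class MaxPriceBuilder
{
    // Builds the equivalent of: g => g.Select(p => p.UnitPrice).Max()
    public static Expression<Func<IGrouping<int, Product>, decimal>> Build()
    {
        // The group that QueryModel b iterates over, i.e. [g].
        ParameterExpression g = Expression.Parameter(typeof(IGrouping<int, Product>), "g");
        Expression queryExpr = g;

        // Hand-written stand-in for the reverse-resolved SelectClause.Selector.
        Expression<Func<Product, decimal>> selector = p => p.UnitPrice;

        // Enumerable.Select<Product, decimal>(queryExpr, selector); the overload
        // is picked by matching the argument types.
        queryExpr = Expression.Call(
            typeof(Enumerable), "Select",
            new[] { typeof(Product), typeof(decimal) },
            queryExpr, selector);

        // Enumerable.Max(IEnumerable<decimal>) is a non-generic overload.
        MethodInfo maxMethod =
            typeof(Enumerable).GetMethod("Max", new[] { typeof(IEnumerable<decimal>) })
            ?? throw new InvalidOperationException("Enumerable.Max(IEnumerable<decimal>) not found");
        queryExpr = Expression.Call(maxMethod, queryExpr);

        return Expression.Lambda<Func<IGrouping<int, Product>, decimal>>(queryExpr, g);
    }
}
```

In the real algorithm the body of this lambda is what gets substituted for the SubQueryExpression holding QueryModel b inside the selector of QueryModel c.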

Return to QueryModel c again: queryModelC == (from g in <expression
built from QueryModel A> select new { [g].Key, MaxPrice = <expression
built from QueryModel B> }).

- Use ReverseResolvingExpressionTreeVisitor to get a LambdaExpression
from the SelectClause's selector. (Pass in your current
sequenceInfo.ItemExpression for QueryModel C.)
- Construct a call to Enumerable.Select using Expression.Call, as
above. Pass in the reverse resolved LambdaExpression. Store the result
within queryExpr and update the sequenceInfo by using
SelectClause.GetOutputDataInfo().

And now, you're done. You have an Expression representing the query.
Now wrap it into a LambdaExpression without parameters, compile it,
and execute it.
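
The final wrap-compile-execute step is short. A minimal sketch, assuming the finished expression is held in a variable of type Expression and that the caller knows the result type (QueryRunner and Run are made-up names):

```csharp
using System;
using System.Linq.Expressions;

public static class QueryRunner
{
    // Wraps the expression into a parameterless lambda, compiles it to a
    // delegate, and invokes that delegate once.
    public static T Run<T>(Expression queryExpr)
    {
        Expression<Func<T>> lambda = Expression.Lambda<Func<T>>(queryExpr);
        Func<T> compiled = lambda.Compile();
        return compiled();
    }
}
```

For the sample query, T would be the IEnumerable<...> of the projected items. Since LINQ to Objects is lazy, the grouping and aggregation only run when the returned sequence is enumerated.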

As I said, it is a complex algorithm, but I think you now have all the
information needed to implement it.

Best regards,
Fabian