I fail to see why the linq provider creates the second query
(@p0+'%').
The reason is that the StartsWith / EndsWith/Contains calls on
'string' are easy to deal with and you can just formulate a like query with
a pattern, no need for parameter concatenation (i.e.: the parameter value IS
the pattern).
I.o.w. a useless restriction (and IMHO a legitimate bug report).
FB
Why? .StartsWith() is a bool returning method usable in a predicate.
I have no idea why I should add '.Should()':
    var q = from c in session.Linq<Customer>()
            where c.CompanyName.StartsWith("Foo")
            select c;
this is a simple, legitimate query which should result in a simple
select on customer with a LIKE predicate having the pattern "Foo%".
> but when you have to translate it for RDBMS you have to fit the mismatch
> translating it to a like 'some%'
that's not a mismatch, it exactly matches the rows you ordered it to
match.
> and ...
> var myPrefix = "so";
> "something".Should().StartsWith(myPrefix + "m");
>
> should be translated to
> like @prefix || 'm' || '%'
why should that translate to that?
myPrefix is an in-memory constant and "m" is a constant. This is
'funcletized' away (at least it _should_ be, by the linq provider) into a lambda
which is compiled into a delegate (that's 1 call) and run in-memory on the
spot, resulting in "som". That's the constant then used in StartsWith().
If you don't scan for in-memory constructs and compile them into
delegates, you can't handle things like:
    var q = from c in session.Linq<Customer>()
            where c.CompanyName.StartsWith(GetCompanyStartFragment())
            select c;
exactly the same thing.
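To make the funcletizing step concrete, here is a minimal sketch of it in plain System.Linq.Expressions. The class and method names (PartialEvaluationSketch, EvaluateLocally, StartsWithPattern) are made up for illustration; this is not how the NH provider or re-linq does it, and it assumes the StartsWith argument does not reference the query's range variable. It also skips wildcard escaping entirely.

```csharp
using System;
using System.Linq.Expressions;

public static class PartialEvaluationSketch
{
    // Hypothetical helper: evaluates a subtree that depends only on in-memory
    // values by wrapping it in a parameterless lambda, compiling that into a
    // delegate and invoking it once. If the subtree referenced the query's
    // range variable, Compile() would throw, so a real provider has to check
    // for that first.
    public static ConstantExpression EvaluateLocally(Expression subtree)
    {
        Delegate compiled = Expression.Lambda(subtree).Compile();
        object value = compiled.DynamicInvoke();
        return Expression.Constant(value, subtree.Type);
    }

    // Returns the LIKE pattern for a predicate of the form
    // x => <string expression>.StartsWith(<in-memory expression>).
    public static string StartsWithPattern<T>(Expression<Func<T, bool>> predicate)
    {
        var call = predicate.Body as MethodCallExpression;
        if (call == null
            || call.Method.Name != "StartsWith"
            || call.Method.DeclaringType != typeof(string))
        {
            throw new NotSupportedException("Only string.StartsWith is handled in this sketch.");
        }

        // The argument (e.g. myPrefix + "m" or GetCompanyStartFragment())
        // collapses into a single constant before any query is generated.
        ConstantExpression constant = EvaluateLocally(call.Arguments[0]);
        return (string)constant.Value + "%";
    }
}
```

With var myPrefix = "so"; a call like PartialEvaluationSketch.StartsWithPattern<string>(s => s.StartsWith(myPrefix + "m")) hands back the complete pattern as one value, so the parameter value IS the pattern and nothing has to be concatenated in the query.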
Now, why does the o/r mapper core NEED to have the '%' split off
from the actual pattern?
FB
I'd expect something like this (TSQL):
x.StartsWith (y + z)
=>
... WHERE x LIKE ESCAPE_WILDCARDS(y + z) + '%' ESCAPE '\'
Where ESCAPE_WILDCARDS would be defined as:
REPLACE (REPLACE (REPLACE (REPLACE (REPLACE(@str, '\', '\\'), '%', '\%'), '_', '\_'), '[', '\['), ']', '\]')
If we can't do that, we'd better limit ourselves to the capabilities of the old provider and do constant escaping in memory. Better simple and correct than powerful and wrong, no? Non-constant patterns for LIKE are a rare sight anyway.
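For the "constant escaping in memory" variant, a rough sketch of what I mean, mirroring the REPLACE chain above. LikePatternSketch, EscapeWildcards and StartsWithPattern are hypothetical names, the wildcard set is the T-SQL one, and the backslash escape character is an assumption that must match the ESCAPE clause emitted in the query.

```csharp
using System.Text;

public static class LikePatternSketch
{
    // Prefixes the escape character itself and the T-SQL wildcards
    // (%, _, [, ]) with the escape character. Other dialects may need a
    // different wildcard set.
    public static string EscapeWildcards(string value, char escape = '\\')
    {
        var sb = new StringBuilder(value.Length + 8);
        foreach (char c in value)
        {
            if (c == escape || c == '%' || c == '_' || c == '[' || c == ']')
            {
                sb.Append(escape);
            }
            sb.Append(c);
        }
        return sb.ToString();
    }

    // Escapes first, then appends the one '%' that is meant as a wildcard,
    // so the trailing '%' is never escaped by accident.
    public static string StartsWithPattern(string prefix)
    {
        return EscapeWildcards(prefix) + "%";
    }
}
```

The resulting string would be sent as the parameter value of x LIKE @p0 ESCAPE '\'.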
What I wrote is just the SQL, but the new provider never generates SQL, just HQL. I have no idea whether this is even possible in HQL. (I read that ESCAPE is available in Java-Hibernate since v3, no idea how ESCAPE_WILDCARDS could be done in HQL. BTW, I still think we should have access to the interim HQL when discussing LINQ2NH.)
Partial evaluation is a good point. Fortunately, re-linq already does partial evaluation on expressions before LINQ2NH even sees them. Unfortunately, the +'%' operation is not even there at that time. So you could consider replacing the StartsWith(x) with some Like(x.EscapeWildcards() + '%') expression in the QueryModel and then calling the partial evaluator of re-linq again. Should work, meet us on our users list if you want to try that.
Last but not least, I'd like to say that I'd be happy if we could discuss LINQ issues without a constant debate whether something is really necessary or not. If Steve doesn't want to do it, fine. If nobody wants to send a patch either, the issue should be closed. But as long as there are people interested in fixing or implementing stuff, why close an issue? And Fabio, the constant lament about NH not needing more users, and LINQ just being the fifth wheel on a car with lots of query capabilities already, that's just frustrating and helps no one. We've all put a lot of work into the new LINQ provider, please don't treat it like that just because it's not your favorite feature. If nobody wants to fix it and people still complain, there's still enough time to get all defensive.
Thanks,
Stefan
My thanks to you. Everyone here appreciates your relentless struggle to improve NH. Just had to get that one off my chest!
Love,
Stefan :-)
Fabio,
there was no hidden criticism in my last mail, so I’m not sure I get what you’re saying. But you’re right, I don’t see what’s going on behind the scenes. I only see what’s on the list, so that’s all I can relate to. Same is true for other people I guess, like Frans.
If the dev list is a cozy place for people like us to hang out, the entire NH community might benefit from that. If you give a wrong impression here, it will be carried anywhere. I sure don’t want the community to think that the new LINQ provider is incomplete and nobody wants to fix it. Even if that would not reflect the discussions here correctly.
(If I create too much noise for someone who does not contribute to NH directly, just tell me. But now that NH3 with a real LINQ provider is about to be released, I think it won’t hurt to have some re-linq team members lurking on your list. I’ll try to help if I can.)
Cheers,
Stefan
I understand that, but I fail to see how that fits in with our little
discussion. This thread finds me confused.
Anyway, I’m happy if you just leave some room to deal with issues in LINQ,
even if they are edge cases, and give our beautiful new LINQ provider some
unconditional love. Every newborn deserves that, no matter how great its
siblings are!
Good point you brought up here. I can imagine escaping is the reason
why the '%' is separated (I asked a question about this, but it's not
answered) along the way, so you can do simple escaping without running the
risk of escaping the '%' character as well. The problem is though that the
'%' is separated in the _query_, which is odd, as I assume the specific AST
part, namely the LIKE expression part, is handled by a method which only
emits like fragments, and thus knows how to append the '%' after it produces
the escape line.
Not all databases support the same escaping btw (or at all), so this
might be a dialect specific feature.
> Last but not least, I'd like to say that I'd be happy if we could discuss
> LINQ issues without a constant debate whether something is really
> necessary or not. If Steve doesn't want to do it, fine. If nobody wants to
> send a patch either, the issue should be closed. But as long as there are
> people interested in fixing or implementing stuff, why close an issue? And
> Fabio, the constant lament about NH not needing more users, and LINQ just
> being the fifth wheel on a car with lots of query capabilities already,
> that's just frustrating and helps no one. We've all put a lot of work into
> the new LINQ provider, please don't treat it like that just because it's
> not your favorite feature. If nobody wants to fix it and people still
> complain, there's still enough time to get all defensive.
Not only that, but this mailing list is actually completely useless
with respect to linq to nhibernate. I wrote a lengthy reply on the 18th:
http://groups.google.com/group/nhibernate-development/msg/fc8423fc16a80773
and what has been done with it, besides some mocking... nothing. If people
don't want help every now and then from someone who already has written a
linq provider, just say so, and I'll spend my time on something else.
If debates about linq are largely held off-list, posting here is
more of a 'side-line debate' instead of a debate of any value.
FB
Too many assumptions for me to follow up on, someone from the NH team would have to weigh in here (if they want).
For me, it would be much easier to follow this if I could see the interim HQL. As things are now, I often cannot tell whether something is rooted in LINQ to HQL or in HQL to SQL.
I just asked Steve, he said he could do it quite easily (but didn't say if he actually will ;-))
>
> Not all databases support the same escaping btw (or at all), so
> this
> might be a dialect specific feature.
I believe LIKE '100\%%' ESCAPE '\' to be ANSI SQL. Differences exist of course, such as the regex-like [...] in TSQL.
http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
8.5 <like predicate>
Function
Specify a pattern-match comparison.
Format
<like predicate> ::=
<match value> [ NOT ] LIKE <pattern>
[ ESCAPE <escape character> ]
I think it depends on how Linq to NH is converted to the AST and what
the AST is. If the AST is already in SQL-ready format, you essentially have to
solve these problems twice, in HQL and in Linq, which is IMHO a bit odd; the
alternative requires an AST which is richer in nature but harder to convert
to SQL.
> I just asked Steve, he said he could do it quite easily (but didn't say if
> he actually will ;-))
>
> >
> > Not all databases support the same escaping btw (or at all), so this
> > might be a dialect specific feature.
>
> I believe LIKE '100\%%' ESCAPE '\' to be ANSI SQL. Differences exist of
> course, such as the regex-like [...] in TSQL.
>
> http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
>
> 8.5 <like predicate>
>
> Function
>
> Specify a pattern-match comparison.
>
> Format
>
> <like predicate> ::=
> <match value> [ NOT ] LIKE <pattern>
> [ ESCAPE <escape character> ]
Sadly what's in the standard isn't in the databases out there. For
example, T-SQL supports more wildcards than DB2 or Oracle, so
escaping [] is not useful on Oracle (as you don't need to).
FB
If you want a solution for that, just ask (and ask whether it's a
good idea or not). However, from what I've read here so far, we just have to
wait and not say anything. Fine by me, but you and your framework aren't
helped by that IMHO.
FB
well, NH is 'owned' by the people who can change it, i.e. the people
who are allowed to make changes, thus the team. Like the group who decided
to close 2254 and the group who can't answer simple questions on the
official mailing list about why the '%' is appended in an expression instead
of inside the parameter value itself.
> I was talking about a Team and not a person and, btw, in my world any OSS
> project is owned by the cloud.
usage rights maybe, but if 'the cloud' owned the OSS project this
discussion wouldn't be necessary. However, it IS necessary because someone
(you) decided to close a bug. While that's perhaps totally right, you
decided it was perhaps necessary to ask for a discussion here. What's
amusing to see is that the people who actively participate in technical
discussions about the matter are not people who contribute, i.e. 'the team'.
The team itself is silent.
Writing a linq provider takes a lot of dedication and focus, you
can't just 'jump in and provide a patch', that's not going anywhere. You
have to design the thing from start to finish, write tools to help you
develop the thing (like visualizers which view the various stages of the
expression tree), and implement the various stages. You however seem to
think it's a matter of 'someone will come up with a patch for <feature> /
<bug>'. No way in hell that that's going to work for the linq provider.
I thought you were the project lead of NHibernate? If you are, stop
bickering here and start forming a team with dedicated people who know wtf
it is all about to write a linq provider. I am here on this list to see if I
can help, but my experiences so far are below expectations: as if you don't
need my help. But, Fabio, I already solved the problems of your linq
provider 3 years ago. If you don't want my help or the help of other
seasoned linq provider writers, fine by me, but don't think it will
eventually come along fine, it won't.
Unless Steve S. manages to spend more time on the project, but the
poor guy is already swamped with work, it seems.
> See you to the next coming soon feature.
> http://216.121.112.228/browse/NH-2256
Ah yes, sounds very familiar, with one exception: here, the mistake
is made to use 1 registry.
You need 1 per dialect. In there you register .NET methods, per
type, per # of parameters and store a fragment (or whatever you use to
convert to SQL) which is able to convert the function to SQL for that
dialect/db. Per dialect you pre-define a list of DB functions, like CAST,
CONVERT, ABS, CASE and what not.
Then the same mechanism / class can be used by NH users to define
their OWN .net method to SQL fragment mapping. The user then creates such a
class, and passes it to the session when a linq query is formulated. This
user provided registry overrides the one in the dialect which is active in
the session, if a method is mapped twice.
In the MethodCall expression handler early on you look up the
fragment to convert the function to and convert it to your own expression
with the fragment.
That's it. Works very simple and flexible. Usage example:
http://www.llblgen.com/documentation/3.0/LLBLGen%20Pro%20RTF/hh_goto.htm#Using%20the generated code/Linq/gencode_linq_functionmappings.htm
Maps everything in all situations.
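A minimal sketch of that registry idea, to show how little is needed. All names here (FunctionMappingRegistry, FunctionMappingLookup, the "{0}" placeholder format for fragments) are invented for the illustration; this is neither the LLBLGen Pro API nor anything that exists in NH.

```csharp
using System;
using System.Collections.Generic;

// One instance per dialect, plus optionally one supplied by the user.
// Key: declaring .NET type, method name, number of parameters.
public sealed class FunctionMappingRegistry
{
    private readonly Dictionary<(Type, string, int), string> _fragments =
        new Dictionary<(Type, string, int), string>();

    // Registering the same key again replaces the earlier fragment.
    public void Register(Type declaringType, string methodName, int parameterCount, string sqlFragment)
    {
        _fragments[(declaringType, methodName, parameterCount)] = sqlFragment;
    }

    public bool TryGet(Type declaringType, string methodName, int parameterCount, out string sqlFragment)
    {
        return _fragments.TryGetValue((declaringType, methodName, parameterCount), out sqlFragment);
    }
}

public static class FunctionMappingLookup
{
    // Called from the MethodCall expression handler: the user-provided
    // registry wins over the dialect's registry if a method is mapped twice.
    public static string Resolve(
        FunctionMappingRegistry userRegistry,
        FunctionMappingRegistry dialectRegistry,
        Type declaringType, string methodName, int parameterCount)
    {
        string fragment;
        if (userRegistry != null
            && userRegistry.TryGet(declaringType, methodName, parameterCount, out fragment))
        {
            return fragment;
        }
        if (dialectRegistry.TryGet(declaringType, methodName, parameterCount, out fragment))
        {
            return fragment;
        }
        throw new NotSupportedException(
            "No mapping for " + declaringType.Name + "." + methodName + "/" + parameterCount);
    }
}
```

A dialect would pre-register things like (typeof(Math), "Abs", 1) => "ABS({0})", and a user could pass a registry that maps the same key to something else for a single query.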
I won't provide a patch for you at this moment, as I have no
knowledge of the nh linq provider internals, and that will take a lot of
time to get into, but maybe someone on the 'team' could use this to get
things forward.
> Writing a linq provider takes a lot of dedication and focus, you
> can't just 'jump in and provide a patch', that's not going anywhere. You
> have to design the thing from start to finish, write tools to help you
> develop the thing (like visualizers which view the various stages of the
> expression tree), and implement the various stages. You however seem to
> think it's a matter of 'someone will come up with a patch for <feature> /
> <bug>'. No way in hell that that's going to work for the linq provider.
Frans, if you want to understand the dynamics of development here you'll have to take a look at the code. Writing a LINQ provider based on re-linq is a very different endeavor than starting from scratch. You did it the Matt Warren way, basically, right? With re-linq, you get a nice AST and some tools. No IQueryable-based expression tree, no transparent identifiers, etc.
And I understand that HQL is closer to LINQ than SQL is to LINQ, so some transformations are simply not necessary. I think the hardest part is designing how a certain LINQ expression should be translated to HQL - that's why I keep insisting on the HQL output for diagnostics. (Now that's just theory, and I'm sure Steve can tell us about some monumental problems he had to solve, but I still think that you can't judge the accessibility of that code from your own experience. The parts of the LINQ2NH code that I looked at don't look that frightening after all, and that's a good thing!)
Long story short, I believe that once you understand what transformation the code is trying to achieve, any good coder should be able to create a little patch.
> Long story short, I believe that once you understand what transformation the code is trying to achieve, any good coder should be able to create a little patch.
yep
> With re-linq, you get a nice AST and some
> tools. No IQueryable-based expression tree, no transparent identifiers,
> etc.
that's nice :) I hope that it solves the problem of encapsulated
sources (where a source in a join is converted into a property of an
anonymous type, which is accessed in another join, which is then again made
a property of an anonymous type). By that time, the original source is lost;
you have to track it in a tracker to be able to assign the right aliases, as
the complete join is made 1 join list in the output. With a system which
makes this possible without hassle, a LOT is gained. Not sure if re-linq
does that though (haven't looked at it yet).
> And I understand that HQL is closer to LINQ than SQL is to LINQ, so some
> transformations are simply not necessary. I think the hardest part is
> designing how a certain LINQ expression should be translated to HQL -
> that's why I keep insisting on the HQL output for diagnostics. (Now that's
> just theory, and I'm sure Steve can tell us about some monumental problems
> he had to solve, but I still think that you can't judge the accessibility
> of that code from your own experience. The parts of the LINQ2NH code that I
> looked at don't look that frightening after all, and that's a good thing!)
tools which visualize the tree's state at the various stages during
transformation are essential, as they can quickly show you where you did or
didn't do the right thing.
But even with an AST, a lot of problems still remain. For example
the query folding with group by (as group by is outside the query it works
on in linq, in SQL it's part of the same query), multi-aggregate queries
which require query folding (query becomes derived table/subquery of the
subsequent aggregate's source, with value passing in projection)... these
problems are still on your plate (to name a few), unless you've solved these
as well (which would be great :))
> Long story short, I believe that once you understand what transformation
> the code is trying to achieve, any good coder should be able to create a
> little patch.
Yes, if one understands the code, creating a patch isn't that hard.
Getting there however is something else. I simply don't believe stories
where people say they understand just 'a part' of a linq provider and can
make proper decisions about where to change which code to add a feature.
Sure, the easy stuff, like a from + a where and an entity returning select,
or like the many 'full' (read: fall flat on your face if you do something
complex) linq providers out there which simply implement the IQueryable
extension methods and be done with that, that's doable, one could oversee
the consequences when something is changed, as it's not that complex yet. it
gets very complex very quickly after that.
I.o.w.: if you don't have a design which says what you're doing
where, things get too complex to manage as it's not doable to dive in and
focus on something in particular, fix that and move on. Take group join.
(join ... into.. ). It's simple at first, you simply pick one side (left
side) and ignore the other. Till there's a DefaultIfEmpty. Then you have to
pull in the OTHER side you ignored till then, at the spot of the DefaultIfEmpty,
and change the join the DefaultIfEmpty is part of into a left join.
That's not 'some patch', that's a lot of work to get that right, in
all situations and it affects multiple stages in the transformation, so
design of the feature, then decisions where to make the changes etc.
'Creating a patch' is not going to work in this case (or in many other cases
with respect to the linq provider).
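For readers who don't have the pattern in their head, here is what group join + DefaultIfEmpty looks like, as a small in-memory (LINQ to Objects) sketch. The tuple shapes and the GroupJoinSketch name are made up; it only shows the semantics a provider has to reproduce as a left join, not how to do the translation.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class GroupJoinSketch
{
    public static List<string> CustomersWithOrders(
        IEnumerable<(int Id, string Name)> customers,
        IEnumerable<(int CustomerId, string Item)> orders)
    {
        var q = from c in customers
                // Group join: up to here only the left side (c) yields rows.
                join o in orders on c.Id equals o.CustomerId into customerOrders
                // DefaultIfEmpty pulls the right side back in; a customer
                // without orders still yields one row, with a default order
                // (Item is null for the default tuple).
                from order in customerOrders.DefaultIfEmpty()
                select c.Name + ":" + (order.Item ?? "(none)");
        return q.ToList();
    }
}
```

In SQL terms this is customers LEFT JOIN orders; the provider only finds that out once it reaches the DefaultIfEmpty call, which is exactly the problem described above.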
I must say I'm a little surprised that apparently on this list
people think it is simply a matter of waiting for the right patch to come
along.
FB
I just think that someone who has built a complete provider himself may have a wrong impression as to how hard it would be to get into NH's LINQ code, that's why I explained its relationship to re-linq. Right now, L2NH has about 4500 LoC + about 9000 LoC in unit tests. Compare that to DbLinq or your own. Or to re-linq for that matter, which has > 13 KLoC in the frontend (used by NH), 10 KLoC in the SQL generation backend (not used by NH), and 20KLoC in unit tests for frontend and backend each. And the fun only begins when you actually look at the complexity of each piece of code. Our intention was to keep the hardest stuff in the lowest layers, because they have a higher chance of being reused. (E.g., in our first version, we had backtracking of transparent IDs in the back end, which would have meant that every backend would have to handle that.)
The first part of Steve's project, moving HQL from string-based to AST-based, might have been the harder part for all I know.
As for your detailed questions... well... I understand re-linq's architecture when Fabian explains it to me ;-) But I better let him fill in the details.
Cheers,
Stefan
> -----Original Message-----
> From: nhibernate-...@googlegroups.com [mailto:nhibernate-
> devel...@googlegroups.com] On Behalf Of Frans Bouma
> Sent: Thursday, July 29, 2010 11:06 AM
> To: nhibernate-...@googlegroups.com
> Subject: RE: [nhibernate-development] NH-2254
>
oh I agree. Similar to the thing we saw last week with the
skip/take/count stuff.
> I just think that someone who has built a complete provider himself may
> have a wrong impression as to how hard it would be to get into NH's LINQ
> code, that's why I explained its relationship to re-linq. Right now, L2NH
> has about 4500 LoC + about 9000 LoC in unit tests. Compare that to DbLinq or
> your own. Or to re-linq for that matter, which has > 13 KLoC in the frontend
> (used by NH), 10 KLoC in the SQL generation backend (not used by NH), and
> 20KLoC in unit tests for frontend and backend each. And the fun only begins
> when you actually look at the complexity of each piece of code.
LoC measured in ndepend (so true LoC) or from sourcefiles? My linq
provider has in ndepend 6500 LoC. In sourcecode, it's a multiple of that,
but I'm very verbose haha :D
But kidding aside, I see the advantage re-linq brings, no question
about that, I just wonder (and still do) where the complex linq stuff is
solved: in re-linq or does a user of re-linq have to solve these probs?
I'll have a look at re-linq to see what kind of trees it produces
with some of the 'pain' linq queries like: (adventure works)
    var q = from reason in metaData.Reason
            join time in
            (
                from timeHeader in
                (
                    from timeHeader in
                    (
                        from timeHeader in metaData.NonPresentTimeHeader.Where(header => header.Id == 6)
                        join userDetail in metaData.UserDetail on timeHeader.UserId equals userDetail.Id into userJoin
                        from user in userJoin.DefaultIfEmpty()
                        select timeHeader
                    )
                    join userDetail in metaData.UserDetail on timeHeader.ApprovedFromId equals userDetail.Id into approverJoin
                    from approver in approverJoin.DefaultIfEmpty()
                    select timeHeader
                )
                join time in metaData.NonPresentTime on timeHeader.Id equals time.HeaderId into timeJoin
                from joinedTime in timeJoin.DefaultIfEmpty()
                select joinedTime
            ) on reason.Id equals time.ReasonId into reasonJoin
            from joinedReason in reasonJoin.DefaultIfEmpty()
            select new
            {
                ReasonID = reason.Id,
                Reason = reason.Reason,
                HeaderID = joinedReason.HeaderId ?? -1,
                TimeID = joinedReason.Id,
                Notes = joinedReason.Notes,
                DateStart = joinedReason.DateStart,
                DateEnd = joinedReason.DateEnd
            };
it still pains me to see this one fail in my linq provider (among
several other 'headache' queries), but then again, it's not a common query
;) (it's a reproduction of a problem with a real-life query, so it doesn't
make sense, but illustrates a couple of nasty issues)
the above misery query has several complex problems combined which
make it problematic to work with. What would be great is if a pre-processor
solved these, so transformation (to SQL) is easier: elements are
at the right spot for discovery during transformation, so the provider doesn't
have to hunt down sources for particular properties and can work with scopes
easily, so subtree references don't cross scopes for alias assignment,
etc.
If re-linq can solve that, it would indeed be rather 'easy' as in:
way easier than creating it using the 'warren' method with expression
objects which are converted into other expression objects etc. which leads
to painful conversions.
FB
Which was a problem reported for the old contrib provider. A coworker just pointed that out, in the heat of the discussion nobody seemed to notice ;-)
> LoC measured in ndepend (so true LoC) or from sourcefiles? My
> linq
> provider has in ndepend 6500 LoC. In sourcecode, it's a multiple of
> that,
> but I'm very verbose haha :D
Source files, counting comments and license headers too. Grain of salt recommended.
> But kidding aside, I see the advantage re-linq brings, no
> question
> about that, I just wonder (and still do) where the complex linq stuff
> is
> solved: in re-linq or does a user of re-linq have to solve these probs?
>
> I'll have a look at re-linq to see what kind of trees it produces
> with some of the 'pain' linq queries like: (adventure works)
(insulting query removed)
In the frontend, I'm pretty sure we handle that. You'll get a nice query model with neatly separated subqueries.
As for the backend: With a quick look I can't see anything that shouldn't work. (Fabian may.) But we'd have to try, this query was built to destroy LINQ providers after all ;-)
I suggest you take that question to re-moti...@googlegroups.com. I don't want to hijack this list here.
> it still pains me to see this one fail in my linq provider (among
> several other 'headache' queries), but then again, it's not a common
> query
> ;) (it's a reproduction of a problem with a real-life query, so it
> doesn't
> make sense, but illustrates a couple of nasty issues)
I can't see these issues, just an annoying level of subquery nesting, which re-linq should handle gracefully. But maybe I'm missing something. Ask the man!
(The frontend will not resolve the DefaultIfEmpty to a left join though! I think I discussed that with Fabian once, but my memory fails me.)
> If re-linq can solve that, it would indeed be rather 'easy' as
> in:
> way easier than creating it using the 'warren' method with expression
> objects which are converted into other expression objects etc. which
> leads
> to painful conversions.
That's what we're trying!