Hi,
My re-linq backend has been doing a lot for me (thanks, btw). But I am now in a situation where I’d really like to be able to write something like the following:
IQueryable<recoTree> r1, r2;
r1.Concat(r2).Plot(…);
My code is quite happy to do “r1.Plot(…)” and “r2.Plot(…)”. Of course, for many reasons, really doing the Concat there would not work generally.
However, in my case, I know how to add the result of r1.Plot() to r2.Plot(). So, I’d like to make, I suppose, the transform “r1.Concat(r2).Plot(…)” => “r1.Plot(…) + r2.Plot(…)”
I’m just starting to think about how to approach this, and how hard it would be in the re-linq framework. The main issue I’m stumbling on is that I’m turning what looks like one query and processing cycle in re-linq into two.
As a backup I could surface that “addition” everywhere, but that would require a major change in my approach to using LINQ, and it would also end up looking very non-linq’y. 😊
Advice, or comments, or ideas on how to achieve this pseudo-Concat operator in re-linq?
Many thanks!
Cheers,
Gordon.
Hi Michael,
Thanks a lot for thinking about this. What you said might help, but I’m not sure yet. 😊 Let me try a longer explanation of what I’m thinking. And along the way try to answer some of the questions in your email.
My re-linq provider is attached to a file. So all the data from that file is represented by one IQueryable<>. Now, I have multiple files, and I would like to stitch them together into a single sequence, and send that sequence to my “Plot” result operator.
I have a functioning library based on re-linq that will allow me to send the sequence into the Plot result operator.
Re-linq can’t concat arbitrary sequences from different IQueryable data sources for, I think, fairly obvious technical reasons. In my case, the LINQ sequence is converted to C++, and then run in custom analysis software against the data in the file. At some level, combining the files makes no sense.
However, combining the results makes a lot of sense. In this case it is a plot, and I can just “Add”. However, in other cases, it might be something else – like an integer, or a special kind of plot. Whatever, I, as the end user, have to provide the addition semantics.
So perhaps the proper way to represent is to change from Plot (Concat(r1, r2)) to Add(Plot(r1), Plot(r2)).
Now, about your suggestion of doing PlotQueryable.Concat(PlotQueryable): I’d be happy to handle that, but I’m not sure how to do it in the re-linq infrastructure. Re-linq currently fails with a complaint that it doesn’t know how to do the method call Concat (which makes sense). But if I extend re-linq, I’m not really sure how I would handle that. This is because Plot(Concat(r1, r2)) looks to re-linq like a single query, and I need to transform it into two queries, Plot(r1) and Plot(r2), and then add the results.
Just to make this more interesting… Actually running over two files is currently supported by my provider. I just hand the executor a list of files. What I want to do here is take the sequence r1, add some meta data to it, do the same for r2, and then run Plot on it – and the result of Plot would depend on that. Specifically, the data in the two files have different “weights”, so each bit of data in r1 is worth twice the bit of data in r2. Some pseudo code:
IQueryable<recoTree> r1, r2 = ….;
var r1w = r1.Select(r => new { Data = r.value, Weight = 1.0 });
var r2w = r2.Select(r => new { Data = r.value, Weight = 0.05 });
r1w.Concat(r2w).Plot(…, s => s.Data, s => s.Weight).SaveToFile(myfile);
Currently I can write the following, and it builds and runs:
IQueryable<recoTree> r1, r2 = ….;
var r1w = r1.Select(r => new { Data = r.value, Weight = 1.0 });
var r2w = r2.Select(r => new { Data = r.value, Weight = 0.05 });
var p1 = r1w.Plot(…, s => s.Data, s => s.Weight);
var p2 = r2w.Plot(…, s => s.Data, s => s.Weight);
p1.Add(p2).SaveToFile(myfile);
Being able to pass the Concat’ed sequence into my more complex methods and routines, without having to surface that addition everywhere, would make the code significantly cleaner. So I want to be able to write the first block of code instead of the second block.
You mention code that detects a fairly specific start and end point. I have no problem authoring that code in my re-linq backend – I already do some fairly complex transformations (I support tuples, anonymous classes, real classes – none of which actually makes it into the C++ code, along with bridging the .NET and C++ world). But I just can’t conceptually figure out where to intercept the query in the re-linq processing pipeline to scan for Concat’s, and then split it into multiple queries, whose results I then take and add.
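To make the identity Plot(Concat(r1, r2)) = Add(Plot(r1), Plot(r2)) concrete, here is a minimal in-memory sketch. It is only an illustration under stated assumptions: it runs on plain IEnumerable<T> rather than through re-linq, and the Histogram class, its fixed binning, and this Plot signature are hypothetical stand-ins, not the API of the actual provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the real plot type: a 1D weighted histogram.
public sealed class Histogram
{
    private readonly double[] _bins;
    private readonly double _lo, _hi;

    public Histogram(int nBins, double lo, double hi)
    {
        _bins = new double[nBins];
        _lo = lo;
        _hi = hi;
    }

    public IReadOnlyList<double> Bins => _bins;

    // Adds one weighted entry; values outside [lo, hi) are ignored.
    public void Fill(double value, double weight)
    {
        if (value < _lo || value >= _hi) return;
        int i = (int)((value - _lo) / (_hi - _lo) * _bins.Length);
        _bins[Math.Min(i, _bins.Length - 1)] += weight;
    }

    // The user-supplied "addition semantics": bin-by-bin sum.
    public Histogram Add(Histogram other)
    {
        if (other._bins.Length != _bins.Length || other._lo != _lo || other._hi != _hi)
            throw new ArgumentException("Histograms must share the same binning.");
        var sum = new Histogram(_bins.Length, _lo, _hi);
        for (int i = 0; i < _bins.Length; i++)
            sum._bins[i] = _bins[i] + other._bins[i];
        return sum;
    }
}

public static class PlotExtensions
{
    // Hypothetical in-memory Plot: fills a histogram from a sequence.
    public static Histogram Plot<T>(this IEnumerable<T> source, int nBins, double lo, double hi,
                                    Func<T, double> value, Func<T, double> weight)
    {
        var h = new Histogram(nBins, lo, hi);
        foreach (var item in source)
            h.Fill(value(item), weight(item));
        return h;
    }
}
```

With two weighted in-memory sequences r1w and r2w, r1w.Concat(r2w).Plot(…) and r1w.Plot(…).Add(r2w.Plot(…)) fill the same bins with the same totals, which is the property the rewrite relies on. Any result type with a comparable Add would work the same way.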
The best I could come up with was I needed to write a new re-linq provider. I’d write a new Concat function (“Stitch” 😊) that would then perform two different queries and add them together. That is a fair amount of work, so I thought I’d see if there was another part of the pipeline I should be thinking about.
I hope this makes what I want to do more clear.
Cheers,
Gordon
Hmmm… I don’t think so, because I want to be able to do things like this:
var r1w = r1.Select(r => new { Data = r.value, Weight = 1.0 });
var r2w = r2.Select(r => new { Data = r.value, Weight = 0.05 });
var r = r1w.Concat(r2w);
r.Where(m => m.Data > 2.0).Plot(…, s => s.Data, s => s.Weight).SaveToFile(myfile);
In short – I want to apply a Where or other transformations like Select, etc., to the stream.
Also, as you say, doing something like “r1.AsEnumerable()” will generate a sequence in memory, but that would be a bit of a disaster for two reasons. First, some of these data streams are many GBs right now (and will grow). Second, the back end is distributed and remote, so pulling the data back over the network may not be very fast. The “.Plot” operator, however, generates a very small amount of data (a binned histogram of roughly 20-40 KB) from 100’s of GB of data.
Am I making sense?
Cheers,
Gordon
To unsubscribe from this group and stop receiving emails from it, send an email to re-motion-users+unsubscribe@googlegroups.com.
To post to this group, send email to re-motion-users@googlegroups.com.
Visit this group at https://groups.google.com/group/re-motion-users.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups "re-motion Users" group.
Hi,
Clever!
Yes, you can always look at them as split queries – so you can do exactly as you suggest – at each concat, drop one half of the query, then the other, and then add the two together.
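As a rough sketch of the "split at each Concat, then add" idea, the toy visitor below works on plain System.Linq.Expressions trees instead of re-linq QueryModels, so it only illustrates the shape of the transformation. ConcatSplitter and its helper classes are hypothetical names, and it assumes both Concat arguments are themselves IQueryable sources.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class ConcatSplitter
{
    // Returns one Concat-free copy of the query per concatenated source.
    public static List<Expression> Split(Expression query)
    {
        var finder = new ConcatFinder();
        finder.Visit(query);
        if (finder.Found == null)
            return new List<Expression> { query };

        var result = new List<Expression>();
        foreach (var branch in finder.Found.Arguments)
        {
            // Swap the Concat call for one of its two inputs, then keep
            // splitting in case that input contains further Concat calls.
            var replaced = new Replacer(finder.Found, branch).Visit(query);
            result.AddRange(Split(replaced));
        }
        return result;
    }

    // Records the first Queryable.Concat call encountered in the tree.
    private sealed class ConcatFinder : ExpressionVisitor
    {
        public MethodCallExpression Found;

        protected override Expression VisitMethodCall(MethodCallExpression node)
        {
            if (Found == null
                && node.Method.DeclaringType == typeof(Queryable)
                && node.Method.Name == nameof(Queryable.Concat))
            {
                Found = node;
                return node;
            }
            return base.VisitMethodCall(node);
        }
    }

    // Replaces one specific node (by reference) with another.
    private sealed class Replacer : ExpressionVisitor
    {
        private readonly Expression _from;
        private readonly Expression _to;

        public Replacer(Expression from, Expression to)
        {
            _from = from;
            _to = to;
        }

        public override Expression Visit(Expression node)
        {
            return ReferenceEquals(node, _from) ? _to : base.Visit(node);
        }
    }
}
```

Each returned expression can be handed back to the query provider (for example via Provider.CreateQuery<T>) and executed on its own; the per-branch results are then combined with whatever addition the result type defines. Operators whose result is not additive across branches (Count is fine, Average or First is not) would need separate handling.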
Creating a new method with the same semantics as Concat was something I’d thought of as well, and it is easy to capture that during re-linq processing (I do it with lots of other things in my provider, actually). But I’d not thought to make it a result operator, mostly because I’d wanted re-linq to continue processing the query. That sounds like it would do what I want.
But… I’m having trouble with the implied code behind this statement:
> This should leave you with two QueryModels that represent the queries as if you had written them separately
Once I have the two QueryModels, in the middle of my processing, how do I get all the re-linq power to actually process them? Is it as “simple” as calling my query executor with the new QM, or is there some context I need to set up?
Many thanks for your help!!
Cheers,
Gordon
P.S. You note that you switched back to the concat semantics in your last paragraph. Was there a reason for that? If there was, I missed the significance.
Hi,
Ha! Ok – I think I’m making this harder than it should be: I know a few corners of re-linq well and, obviously, not the others.
I have not thought a great deal about how to run a QM through my code to turn it into C++ code. But I use your complete infrastructure of QM visitors and expression visitors, etc., in order to effect that. But I think I copied from some example project how I start from a QM at the top (that re-linq hands to me). I’ve obviously forgotten about some code I wrote at the very start of this project (when, frankly, I probably didn’t know enough to understand the role of the code I was copying).
Ok, I’ve got enough to try this out. I will probably tackle this later this week and I will post back with results or questions.
Again, thanks for your help!
Hi,
I’ve written a QueryVisitor that walks through the QM and splits it. There are some special cases I haven’t tested yet, and the code needs some cleaning (I re-factored at the wrong point). But these will get fixed. The basic idea is there. And it is much shorter than I’d thought it was going to be.
If you feel up to it, you can take a quick look and see if what I’m doing basically matches what you had in mind:
https://github.com/gordonwatts/LINQtoROOT/commit/add1fb31a8fcd4faeb0974839ec73cb8b68ceff5
Look for the ConcatSplitterQueryVisitor.cs file to see the actual QM visiting and splitting code. Just above is the unit test file that drives it. Comments, obviously, welcome. I’ve updated and modified expressions in QueryModels, but I’ve never actually changed the structure of a QM before. Comments on style as well as correctness are definitely welcome! 😊
My next step is to integrate it into the execution environment. That will be the ultimate test to see if I got a query source reference out of place. But that is for later this week, as I have quite a bit of infrastructure around that and so I must proceed a little more carefully (it will also not be interesting to anyone on this list).
Cheers,
Gordon
From: Michael Ketting
Sent: Tuesday, February 23, 2016 9:06 AM
To: re-motion Users
Cc: fabian....@gmail.com
Subject: Re: [re-motion-users] Re: Using Concat
Thanks!
By then it will have been cleaned and updated as I better understand the use cases I have to handle, so after clicking on that link, go to the actual file, rather than that change set.
And no worries. Weekends are meant for enjoyment, not for looking at a stranger’s code! 😊
Cheers,
Gordon
From: Michael Ketting
Sent: Wednesday, February 24, 2016 8:13 AM
To: re-motion Users
Subject: Re: [re-motion-users] Re: Using Concat