Message from discussion Efficient way to groupBy and filter

Date: Sat, 10 Nov 2012 17:03:43 -0800 (PST)
From: jstorm <infected...@gmail.com>
To: gremlin-users@googlegroups.com
Message-Id: <4e2bb987-d7b6-4c2f-8e03-34c5049e2ff0@googlegroups.com>
In-Reply-To: <CAA-H4394hFUjq1LC42bdLSyLzmE2ecdLud=KCd2Jo53-++ukgg@mail.gmail.com>
References: <85437ca9-8a02-439d-bb53-120d8bd93192@googlegroups.com>
 <19dee1d1-fbd9-494a-84bc-d989b8ac2830@googlegroups.com>
 <CAA-H4394hFUjq1LC42bdLSyLzmE2ecdLud=KCd2Jo53-++ukgg@mail.gmail.com>
Subject: Re: [TinkerPop] Re: Efficient way to groupBy and filter
MIME-Version: 1.0
Content-Type: multipart/mixed; 
	boundary="----=_Part_7_28791508.1352595823402"

------=_Part_7_28791508.1352595823402
Content-Type: multipart/alternative; 
	boundary="----=_Part_8_10412622.1352595823403"

------=_Part_8_10412622.1352595823403
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

Thanks for your help Stephen :)
 
Point taken. I am actually learning Groovy and Gremlin (with no 
experience in Java at all) and trying to apply them to problems I need 
to solve, but perhaps I have bitten off a bit more than I can chew at the 
moment.
 
I am currently going through Groovy in Action (an excellent book, by the 
way, for those who want to get started with Groovy but have no Java 
experience). Previously I was reading Programming Groovy (Pragmatic 
Programmers), but I felt that it put too much emphasis on knowing Java 
(Groovy in Action is much more neutral).
 
I do wish there were a book for Gremlin though (perhaps one of you guys 
could write one :) ). Currently I am relying on the wiki on GitHub and 
gremlindocs.com, but for a beginner there isn't really a way to ease 
into Gremlin, so to speak.
 
Cheers :)

On Saturday, November 10, 2012 1:58:48 AM UTC+11, Stephen Mallette wrote:

> You can filter in the reduce closure pretty easily so that you don't 
> have to post-process the outputted map from the groupBy: 
>
> gremlin> g.V.out.groupBy{it.name}{it.in}{it.unique().findAll{i -> i.age > 30}.name}.cap 
> ==>{lop=[josh, peter], ripple=[josh], josh=[], vadas=[]} 
>
> In this way you can evaluate each item extracted into the grouped 
> value of the map.  I guess you could do some filtering in the value 
> closure as well: 
>
> gremlin> g.V.out.groupBy{it.name}{it.in.filter{i -> i.age > 30}.name}{it.toList().unique()}.cap 
> ==>{lop=[josh, peter], ripple=[josh], josh=[], vadas=[]} 
>
> I think you should spend some time understanding the underpinnings of 
> Groovy and its closures, and how they relate back to Gremlin.  It 
> sounds like you're attempting some less-than-trivial things in 
> your work.  Consider taking some time away from these very complex 
> problems and focusing on just getting things to work.  Once they work, 
> you can refine and improve.  Just a suggestion :) 
>
> Stephen 
>
> On Thu, Nov 8, 2012 at 8:41 PM, jstorm <infec...@gmail.com> wrote: 
> > P.S. I understand that one possible solution is to filter out the 
> > vertices that do not match the datetime requirements before we do the 
> > groupBy, but let's say I have the following: 
> > 
> > date: 11 November 9:00 PM 
> > object: v[1] 
> > 
> > date: 15 November 8:00 PM 
> > object: v[1] 
> > 
> > Let's say I only want events before 12 November 9:00 PM. 
> > 
> > The above two will be grouped together, but because one of the 
> > vertices falls outside the time range, all vertices that would be 
> > grouped under v[1] should be discarded and not grouped. If I do the 
> > filtering before the grouping, I am unable to implement this. 
> > 
> > Any input appreciated :) 
> > 
> > 
> > On Friday, November 9, 2012 12:11:53 PM UTC+11, jstorm wrote: 
> >> 
> >> I have a few simple vertices with some properties I would like to 
> >> groupBy on. 
> >> 
> >> These vertices also have a property called date which contains a date 
> >> generated using new Date(). 
> >> 
> >> What I would like to do is groupBy on a property called target and, 
> >> if the date of any of the vertices we are grouping does not meet a 
> >> certain condition, not do the grouping for them at all, so that they 
> >> do not appear in the final grouped results. 
> >> 
> >> Looking at the wiki for groupBy, I can use the key and value functions 
> >> to groupBy and to determine what will get pushed to the final 
> >> collection outputted by groupBy. 
> >> 
> >> Besides iterating through the output of groupBy and removing results 
> >> where the child vertices do not match the date requirements (I think 
> >> this could be quite expensive), is there a way to do this filtering 
> >> during the groupBy process? 
> >> 
> >> Cheers :) 
> > 
> > -- 
> > 
> > 
>
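
For reference, the all-or-nothing requirement from the P.S. quoted above 
(drop a whole group as soon as any one of its members falls outside the 
time range) can be sketched in plain Groovy, without Gremlin at all. This 
is only an illustration under assumptions: the events list, the id, target 
and date keys, the at helper and the v[2] sample rows are all made up for 
the example, and the maps merely stand in for vertices. It uses Groovy's 
collection groupBy, not the Gremlin groupBy step, and it does post-process 
the grouped map, but only with one pass over each group.

```groovy
// Hypothetical helper: builds a Date in November 2012 (assumed for the example).
def at = { int day, int hour ->
    new GregorianCalendar(2012, Calendar.NOVEMBER, day, hour, 0).time
}

// Hypothetical sample data: each map stands in for a vertex.
def events = [
    [id: 'a', target: 'v[1]', date: at(11, 21)],  // 11 November 9:00 PM
    [id: 'b', target: 'v[1]', date: at(15, 20)],  // 15 November 8:00 PM
    [id: 'c', target: 'v[2]', date: at(10, 9)],   // 10 November 9:00 AM
    [id: 'd', target: 'v[2]', date: at(12, 8)],   // 12 November 8:00 AM
]

def cutoff = at(12, 21)  // only events before 12 November 9:00 PM

// Step 1: group everything by target, with no filtering yet.
def grouped = events.groupBy { it.target }

// Step 2: keep a group only if every member is before the cutoff.
def valid = grouped.findAll { target, members ->
    members.every { it.date < cutoff }
}

// Step 3: reduce each surviving group to the ids of its members.
def kept = valid.collectEntries { target, members -> [target, members*.id] }

assert kept == ['v[2]': ['c', 'd']]
```

The v[1] group is dropped entirely because event b is after the cutoff, 
even though event a on its own would have passed. Filtering before the 
grouping would have kept a and lost that information.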

------=_Part_8_10412622.1352595823403--

------=_Part_7_28791508.1352595823402--