one "TINY" question about defining a "BAG"


BgFu

Jan 4, 2012, 2:00:21 AM
to DataFu

Hi,

I just started working on a few things, and one question came up while
dealing with operations on bags:

1. A tuple can hold one int or two. Can a bag contain both kinds of
tuples? In other words, is {(1), (3), (2, 5)} allowed, or is it always
going to be either {(1), (3), (5)} or {(2, 3), (1, 5), (2, 1)}? I know
from the example in the file that I should assume (1) and (1, 5) can't
be in the same bag, but this was bugging me since it matters in the
implementation, so I thought I should ask anyway.

I started programming in Java not too long ago, so I'm pretty much
just playing around by building the Java classes of DataFu without
using any Pig :-). I was able to hack a few separately (I think :-)
and have just started working on PageRank. Is anyone else working on
PageRank?

Matt Hayes

Jan 4, 2012, 12:32:06 PM
to dat...@googlegroups.com
Thanks for your interest in DataFu :)

To answer your question, a bag can only contain one kind of tuple.
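
For a standalone Java experiment, the "one kind of tuple per bag" assumption can be checked when tuples are added. The following is only a minimal sketch: UniformBag is a hypothetical class made up for illustration, not Pig's DataBag or anything in DataFu, and it models tuples as lists of ints.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a bag of int tuples; NOT Pig's DataBag API.
final class UniformBag {
    private final List<List<Integer>> tuples = new ArrayList<>();
    private int arity = -1; // number of fields per tuple; unknown until the first add

    // Adds a tuple, rejecting it if its field count differs from earlier tuples.
    void add(Integer... fields) {
        if (arity == -1) {
            arity = fields.length;
        } else if (fields.length != arity) {
            throw new IllegalArgumentException(
                "expected " + arity + " field(s) but got " + fields.length);
        }
        tuples.add(Arrays.asList(fields.clone()));
    }

    int size() {
        return tuples.size();
    }

    int arity() {
        return arity;
    }
}
```

With this sketch, adding (1) and then (3) succeeds, while adding (2, 5) to the same bag throws an IllegalArgumentException.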

If you're interested in PageRank you can try running the pig tests:

ant test -Dtestclasses.pattern=**/PageRankTests.class

The test uses the Pig script test/pig/datafu/test/pig/linkanalysis/pageRankTest.pig and the graph from the PageRank page on Wikipedia: http://en.wikipedia.org/wiki/PageRank. You can also play around with the PageRank class directly without using Pig: src/java/datafu/linkanalysis/PageRank.java.

-Matt

BgFu

Jan 6, 2012, 1:05:25 AM
to DataFu

Thanks for the response! Yes, I'm following the Pig version and the
graph concept to understand how the rank is computed and to implement
it in my Java version. It's probably unnecessary since there is
already a Java version, but it gives me a playground to work with
Java meaningfully! So, thanks for that!
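
For anyone else following along, here is roughly how the rank computation can be sketched in plain Java using power iteration. This is not how DataFu's PageRank class is written or what its API looks like; the class name PageRankSketch, the adjacency-array input, and the even redistribution of rank from nodes with no outgoing links are all assumptions made for illustration.

```java
import java.util.Arrays;

// Illustrative power-iteration PageRank; NOT the DataFu implementation.
final class PageRankSketch {

    // outLinks[u] lists the nodes that u links to; damping is commonly set to 0.85.
    static double[] rank(int[][] outLinks, double damping, int iterations) {
        int n = outLinks.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n); // start from a uniform distribution

        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            double dangling = 0.0; // rank held by nodes without outgoing links

            for (int u = 0; u < n; u++) {
                if (outLinks[u].length == 0) {
                    dangling += rank[u];
                    continue;
                }
                // Each node splits its current rank evenly among its targets.
                double share = rank[u] / outLinks[u].length;
                for (int v : outLinks[u]) {
                    next[v] += share;
                }
            }

            for (int v = 0; v < n; v++) {
                // Assumption: dangling rank is spread evenly over all nodes.
                next[v] = (1.0 - damping) / n + damping * (next[v] + dangling / n);
            }
            rank = next;
        }
        return rank;
    }
}
```

Calling PageRankSketch.rank(new int[][] {{1}, {2}, {0}}, 0.85, 50) on a three-node cycle leaves every node with a rank of 1/3, since each node has exactly one incoming and one outgoing link.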


