DataFu
DataFu has been accepted into Apache Incubator. The new project page can be found at http://datafu.incubator.apache.org/.
Please direct new questions to the dev mailing list listed at http://incubator.apache.org/projects/datafu.html.
This group will be retained to keep a record of previous questions.
Aaron Josephs, Matthew Hayes
7
7/22/14
Issue running hourglass on CDH4
Can you email these details to the apache mailing list? On Tue, Jul 22, 2014 at 12:26 PM, Aaron
Sai SaiGraph
6/23/14
Sample impressions file.
Hi Just trying to learn datafu samples and i am trying to practise this example: https://datafu.
Jason Bodnar, Matthew Hayes
2
2/25/14
Combining job that runs more frequently than daily
MapReduce jobs in general may not be the right choice if you are looking at low latency updates like
Abhishek Gayakwad, Matthew Hayes
2
2/12/14
Mapping output of Hourglass jobs to hive tables
The jobs have methods getOutputSchemaName() and getOutputSchemaNamespace() that can be overridden. By
Abhishek Gayakwad, Matthew Hayes
2
2/5/14
Hourglass Input paths
Hi Abhishek, At the moment the input directory structure is fixed to the yyyy/mm/dd format. This is
Abhishek Gayakwad
2/3/14
why datafu hourglass has hard dependency on avro format ? <eom>
Adrian Landman, Matthew Hayes
4
1/15/14
Negative variance...
I see, strange that you'd have heap issues with min and max together. So the code you just posted
Rizwana Rizia, … Matthew Hayes
4
11/13/13
Compute percentile
I gave a talk where I walked through this type of scenario. Check out the slides here: http://www.
Jarek Jarcec Cecho, Matthew Hayes
9
11/6/13
Doing official release with Pig 0.12.0 support?
Looks great, thank you! Jarcec On Wed, Nov 06, 2013 at 05:50:14PM -0800, Matthew Hayes wrote: >
Faraz Rasheed, Matt Hayes
3
7/3/13
Sessionize() giving Unexpected internal error. Expected input bag to contain a TUPLE, but instead found chararray
Thanks Matt, this is what I thought as well that probably I was using an older version of pig, thanks
Sajid Raza, Matthew Hayes
3
6/23/13
Datafu Branches on Github
Branches have been created. -Matt On Fri, Jun 21, 2013 at 9:24 AM, Matthew Hayes <matthew.terence.
Mike Sukmanowsky, Matthew Hayes
2
1/10/13
Sessionize spills records to disk
Hi Mike, What is the failure you are getting? Out of memory? Do you know the upper bound on how many
Josh Rosenberg, Sam Shah
2
1/7/13
SetUnion does not handle large inputs gracefully
Josh, you are absolutely correct. We've merged in a patch: https://github.com/linkedin/datafu/
Johan Gustavsson, Matthew Hayes
7
11/14/12
Trying to use MarkovPairs but keep getting errors
Sorry for the late reply, I'm not sure why it wouldn't work, in fact I haven't been able
Ryan Michael
9/17/12
Can someone give me an example of ApplyQuantile use?
I've been hitting my head against a wall for a good 4 hours trying to get ApplyQuantile to work.
Johnny Zhang, Matthew Hayes
2
8/29/12
run Datafu unit test in JUnit framework
Hi Johnny, I don't know of a way to run the TestNG tests using JUnit. Maybe it wasn't the
Amit, Sam Shah
3
4/11/12
How does the Median/Quantiles work
On Wed, Apr 11, 2012 at 8:17 AM, Amit <mahal...@gmail.com> wrote: > Can someone example
James Newhaven
4/11/12
BagSplit Question
I've used BagSplit which generates a relation with the following schema: {datafu.pig.bags.
Johnny Zhang
2
3/2/12
Got an error when running "ant test"
has been resolved, never mind, thanks On Mar 2, 3:45 pm, Johnny Zhang <xiao...@cloudera.com>
Joseph Wang, Matt Hayes
10
2/10/12
pagerank input data
Hmm that could be a problem. We have only tested it on pig 0.9. On Fri, Feb 10, 2012 at 7:22 PM,
Chris Diehl, Sam Shah
3
1/25/12
Build Failure
Hey Sam, Thanks! Definitely been a while since we've crossed paths. Hope all is well on your end.
BgFu, Matt Hayes
3
1/6/12
one "TINY" question about defining a "BAG"
Thanks for the response! Yes, I'm following the pig version and the graph concept to understand