On Mon, Jul 19, 2010 at 3:34 PM, nathanmarz <natha...@gmail.com> wrote:
> Here's what I'm thinking for design. Let me know what you think.
>
> There will be two ways to add traps to a query or set of queries. The
> first is with "with-trap", i.e.:
>
> (with-trap (hfs-textline "/tmp/mytrap")
> (?<- ...)
> (?- ...)
> )
This seems straightforward and matches my understanding of the with-*
semantics. I haven't used traps in normal Cascading code, so I'm
curious: would these traps keep an HFS file open? Or would they open a
file in a directory for each expression?
> (with-trap-map {"error-subquery" (hfs-textline "/tmp/mytrap")}
> (let [sq (<- [?f1 ?f2] (source ?f1) (possible-error-op ?f1 :> ?f2)
> (possible-error-filter ?f1) (:name "error-subquery"))]
> (?<- (hfs-textline "/tmp/results") [?f3] (sq _ ?f2) (* 2 ?f2 :> ?f3))))
This seems pretty awkward, as it requires linking up failure cases
across what could be a large number of lines. Again, I'm not sure
how this would work implementation-wise, but I think something like
(<- [?f1 ?f2] (source ?f1) (possible-error-op ?f1 :> ?f2) (:trap
(hfs-textline "/tmp/mytrap")))
might work better. Then you have the trap embedded in the query.
Having this usage also seems like it would make the with-trap easier
to implement (just inject the (:trap ) clause in each expression).
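To make the "inject the (:trap ) clause" idea concrete, here is a minimal
sketch in plain Clojure. It treats queries as quoted forms and does not
touch Cascalog at all. The names inject-trap and with-trap* are
hypothetical, the (:trap ...) clause is only the syntax proposed above, and
the sketch rewrites top-level query forms only (a subquery bound inside a
let would need a tree walk).

```clojure
;; Sketch only: `inject-trap` and `with-trap*` are hypothetical names,
;; not Cascalog functions. Queries are handled as plain quoted forms.

(def query-heads
  "Symbols that start a query form in the examples in this thread."
  #{'<- '?<-})

(defn query-form?
  "True if form is a list-like form starting with <- or ?<-."
  [form]
  (and (seq? form) (contains? query-heads (first form))))

(defn has-trap?
  "True if the query already carries a (:trap ...) clause."
  [query]
  (boolean (some #(and (seq? %) (= :trap (first %))) query)))

(defn inject-trap
  "Appends (:trap trap-form) to a query form unless it already has one."
  [trap-form query]
  (if (has-trap? query)
    query
    (concat query [(list :trap trap-form)])))

(defn with-trap*
  "Adds the trap clause to every top-level query form in body;
  other forms pass through unchanged."
  [trap-form body]
  (map #(if (query-form? %) (inject-trap trap-form %) %) body))

(prn (inject-trap '(hfs-textline "/tmp/mytrap")
                  '(<- [?f1 ?f2] (source ?f1))))
;; => (<- [?f1 ?f2] (source ?f1) (:trap (hfs-textline "/tmp/mytrap")))
```

A real with-trap would presumably be a macro that applies something like
with-trap* to its body before the forms are evaluated. Leaving an existing
(:trap ...) clause alone lets a query-level trap take precedence over the
surrounding one, which is one possible design choice, not a settled one.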
Jim
It works as you would expect.
One trap directory for each trap, and one part file for each mapper/reducer.
The part file stays open once opened, if ever, so it can accept additional tuples.
Hadoop MR does not support appends, so we couldn't close and reopen the file even if we wanted to.
If you want a trap for each expression, you need to name each pipe individually and bind a trap to it.
Traps are not a device for filtering. They exist only to capture exceptional, unanticipated cases when you don't want the job to stop in the face of them.
ckw
--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com
Same here...seems to work for me too...