I'm currently investigating PigPen for some annoyingly formatted data. I have a command-line Clojure script that uses a regex to turn it into the map-reduce format I need:
(defn my-data
  []
  (clojure.string/split (slurp "inputfixed.txt") #"\n"))

(defn myrun
  []
  (->> (my-data)
       (map #(rest (re-find #"\| ([.a-zA-Z]+)\|.+(az67a|trffz)" %)))
       (into () (filter issafe))
       ..
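To make the extraction step concrete, here is a small plain-Clojure sketch of what the regex in myrun pulls out of a single line. The sample lines and the helper name extract-fields are hypothetical (the real layout of inputfixed.txt is not shown here); only the regex itself is taken from the script above.

```clojure
;; Hypothetical sample lines; the real contents of inputfixed.txt are not shown.
(def sample-lines
  ["2016-01-01 | foo.Bar| some text az67a end"
   "a line that does not match"])

(defn extract-fields
  "Returns the two capture groups as a seq, or () when the line does not match.
   re-find returns [whole-match group1 group2] on a match and nil otherwise,
   so rest drops the whole match and turns nil into ()."
  [line]
  (rest (re-find #"\| ([.a-zA-Z]+)\|.+(az67a|trffz)" line)))

(map extract-fields sample-lines)
;; => (("foo.Bar" "az67a") ())
```

Note that non-matching lines come through as empty sequences rather than being dropped, so a later filter step has to deal with them.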
I had hoped to port this to PigPen largely by replacing my-data with a call to pig/load-string. However, whatever I do after that, pig/dump just returns (), both before and after I add pig/map and pig/filter.
I've found that if I rewrite my input so that I can use load-tsv, I can make PigPen work, i.e. I believe I'm generally using PigPen correctly. Of course, reformatting all input before handing it to PigPen is not my ideal situation.
Any assistance appreciated.
Thanks for looking at this.
Unfortunately I don't know what to tell you. I was working on a minimal sample that demonstrated the problem without pasting tonnes of irrelevant code, and found that once I changed the dependencies from Clojure 1.6.0 (as per the tutorial) to Clojure 1.8.0 (what Leiningen defaulted to), I could no longer reproduce the problem.
I hate inconclusive tickets myself, but I'll come back if I find anything definite to paste.