The method you show is just going to re-sort the data four times, once per key, so that’s not going to work as expected.
There’s nothing to my knowledge in Clojure that has `then-by` semantics, so the idiomatic approach is to put what you want to sort by into a vector. However, Pig doesn’t support custom comparators, so that won’t work in PigPen. Distributed sorts aren’t amenable to traditional comparators anyway: it’s insufficient to just compare two values; what you really need is a way to partition them into ordered buckets of roughly equal size.
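For reference, here’s a minimal sketch of the vector approach in plain, in-memory Clojure (not PigPen). The records and the `:a` `:b` `:c` `:d` field names are made up for illustration. It works locally because same-length vectors compare element by element:

```clojure
;; Hypothetical sample data; the field names are assumptions.
(def records
  [{:a 2 :b 1 :c 0 :d 5}
   {:a 1 :b 2 :c 0 :d 3}
   {:a 1 :b 1 :c 9 :d 1}
   {:a 1 :b 1 :c 9 :d 0}])

;; juxt builds a key-fn that returns [a b c d] for each record.
;; Same-length vectors are compared element by element, which gives
;; "sort by :a, then :b, then :c, then :d" in a single pass.
(def sorted
  (sort-by (juxt :a :b :c :d) records))
```

This is only the local behavior; as noted above, it doesn’t carry over to a Pig sort as-is.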
Ideally we would want some syntax that looks for something like `(pig/sort-by (juxt :a :b :c :d))` and expands that into the Pig syntax for sorting by multiple columns. But that’s kind of hacky and precludes using any other function that takes an input record and returns a vector of values to sort by. Another possibility would be to add an option that specifies the number of values to sort by and takes an arbitrary key-fn that returns a vector of that many values. Those could then be exploded into the Pig syntax and it should work.
So, right now there’s not a great way to do what you want. I could potentially add the aforementioned features, but it’ll take me a couple of weeks as I’m heads down on another project at the moment. As a workaround, you could create a string sort key that satisfies your sorting requirements using string ordering. This is super hacky and likely to be very error prone though. I know it’s cliché, but pull requests are welcome if you want to take a stab at doing it the right way. :)
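If you do go the string sort key route, here’s a rough sketch of what I mean. It assumes all four fields are non-negative integers of at most 10 digits; `sort-key` is a hypothetical helper name, and the `:a` `:b` `:c` `:d` fields are placeholders. Negative numbers, wider values, or string fields would each need their own encoding, which is exactly why this is error prone:

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: builds one string whose lexicographic order
;; matches the numeric order of [a b c d].
;; Assumes each field is a non-negative integer with at most 10 digits.
(defn sort-key
  [{:keys [a b c d]}]
  ;; Zero-padding to a fixed width makes string order agree with
  ;; numeric order (without it, "10" would sort before "9").
  (str/join "|" (map #(format "%010d" %) [a b c d])))

;; Hypothetical sample data to show the ordering locally.
(def sample
  [{:a 2 :b 10 :c 0 :d 5}
   {:a 2 :b 9 :c 0 :d 5}
   {:a 1 :b 1 :c 9 :d 0}])

(def sorted-by-string (sort-by sort-key sample))
```

You would then use a function like this as the single sort key in place of the four separate sorts.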
Sorry for the unsatisfying answer. Let me know if you have any more questions.