I'm not sure what you mean by optimizing it internally or by a secondary sort, but the pig/sort command uses Pig's order-by command [1] under the hood.
That is probably the command you want. It does a parallel sort: it partitions the data across n reducers, sorts each partition individually, and then merges the results.
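To make the partition, sort, and merge idea concrete, here is a rough single-machine sketch in Python. It is only a conceptual illustration and not how Pig or pig/sort is actually implemented; the function name `partitioned_sort` and the round-robin split are assumptions made up for this example.

```python
import heapq
from typing import List


def partitioned_sort(items: List[int], n: int) -> List[int]:
    """Sort items by splitting them into n partitions, sorting each, then merging."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Round-robin split; a stand-in for distributing records across n reducers.
    partitions = [items[i::n] for i in range(n)]
    # Each partition is sorted on its own (on a cluster these would run in parallel).
    sorted_parts = [sorted(part) for part in partitions]
    # k-way merge of the already-sorted partitions into one ordered result.
    return list(heapq.merge(*sorted_parts))


if __name__ == "__main__":
    print(partitioned_sort([5, 3, 8, 1, 9, 2], 3))
```

The point is that no single worker ever has to sort the whole dataset; each one handles roughly 1/n of the records.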
The pig/fold command is intended to reduce all of the data into a single value, so sorting a large dataset with it is probably the worst case for performance. The fold/sort function is only included for completeness and probably has very little real-world use.
-Matt