The current implementation of `sum` works as follows.
For arrays longer than a certain threshold (1024 or 2048?), it recursively divides the array into two halves and sums each half separately. Once the part to be summed is short enough, it falls back to a four-way sequential summing algorithm, i.e. x[i], x[i+1], x[i+2], x[i+3] are added to four separate accumulators.
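To make the scheme concrete, here is a rough sketch in Python. It is only an illustration of the idea described above, not the actual implementation: the names `pairwise_sum`, `sum_four_way` and `BLOCK_THRESHOLD` are made up for this example, the cutoff of 1024 is an assumption, and how the leftover elements and the four accumulators are combined is my own choice.

```python
BLOCK_THRESHOLD = 1024  # assumed cutoff; the real value may be 1024 or 2048


def sum_four_way(x, lo, hi):
    """Sum x[lo:hi] sequentially using four independent accumulators."""
    s0 = s1 = s2 = s3 = 0.0
    i = lo
    # Main loop: x[i], x[i+1], x[i+2], x[i+3] each go to their own accumulator.
    while i + 4 <= hi:
        s0 += x[i]
        s1 += x[i + 1]
        s2 += x[i + 2]
        s3 += x[i + 3]
        i += 4
    # Fewer than four elements remain; fold them into the first accumulator
    # (an arbitrary choice made for this sketch).
    while i < hi:
        s0 += x[i]
        i += 1
    return (s0 + s1) + (s2 + s3)


def pairwise_sum(x, lo=0, hi=None, threshold=BLOCK_THRESHOLD):
    """Recursively halve x[lo:hi] until a piece has at most `threshold` elements."""
    if hi is None:
        hi = len(x)
    if hi - lo <= threshold:
        return sum_four_way(x, lo, hi)
    mid = lo + (hi - lo) // 2
    return pairwise_sum(x, lo, mid, threshold) + pairwise_sum(x, mid, hi, threshold)
```

Note that the order in which elements are combined here depends on both the threshold and the array length, which is exactly why the result can differ from a plain left-to-right loop in the last few bits.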
I expect better implementations will be developed, and we may switch to a SIMD-based summing algorithm in the future.
So never rely on the `sum` function adding the values in any specific order.
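The reason the order matters at all is that floating-point addition is not associative. A tiny Python illustration (the variable names are just for this example):

```python
a, b, c = 0.1, 0.2, 0.3

left_first = (a + b) + c   # group the first two values first
right_first = a + (b + c)  # same values, different grouping

# The two results differ in the last bit, so this prints False.
print(left_first == right_first)
```

Any change in how `sum` groups its additions can shift the result by a rounding error like this one.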
Dahua