Benchmarking is hard.
If you really want to know why, you will have to look at the generated
assembly code. There are likely to be differences there. For
example, your 1a and 1b loops use a pointer to a global variable, but
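your 0a and 0b loops use a global variable directly. References
through a pointer already held in a register are often cheaper than
references to a named global variable, whose address may have to be
loaded again for each access (for example, through the GOT in
position-independent code). I don't know if that is the difference
here, but it could be.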
It could also be something difficult to control for, like loop alignment.
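
To make the pointer-versus-global comparison concrete, here is a minimal
sketch of the two access patterns. It is not your actual test program:
the names counter, counter_ptr, sum_direct and sum_via_pointer are
made up for illustration, and the loop bodies are only a guess at the
general shape of your 0a/0b and 1a/1b loops.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the globals in the original benchmark. */
long counter;                  /* named global, referenced directly */
long *counter_ptr = &counter;  /* pointer to that same global */

/* Shape assumed for the "0" loops: every access names the global. */
long sum_direct(const int *data, size_t n)
{
    counter = 0;
    for (size_t i = 0; i < n; i++)
        counter += data[i];
    return counter;
}

/* Shape assumed for the "1" loops: the global is reached via a pointer
   that is read once before the loop starts. */
long sum_via_pointer(const int *data, size_t n)
{
    long *p = counter_ptr;
    *p = 0;
    for (size_t i = 0; i < n; i++)
        *p += data[i];
    return *p;
}
```

If you save this as, say, example.c and compile it with gcc -O2 -S
(and again with -fPIC added), you can compare the two loop bodies in
the resulting example.s side by side. Whether they differ at all, and
in which direction, depends on the target and the options, so treat it
as a way to look rather than a prediction of what you will see.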
Ian