It's hard to tell why it is so slow, but the run time should depend entirely on the time taken for each cp2k evaluation. With your input it needs to evaluate 5 energies per step, so with a single cp2k instance running, your 130 steps correspond to about 650 energy evaluations. 4 days for that is extremely slow for a small water box, about what I'd expect if you were running on a single CPU. So I recommend you first check the timings for running cp2k on its own, and check that it's actually compiled with parallel execution. Then, as Mariana says, you can run multiple cp2k instances, but this is far too slow to start with, and it's a problem with your cp2k setup, not with the interaction with i-pi.
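To put a number on it, here is a rough back-of-the-envelope sketch in Python. It assumes the 4 days cover exactly the 130 steps and that the single instance runs the evaluations one after another; the function name is just made up for illustration.

```python
# Rough estimate of the wall time per cp2k energy evaluation.
# Assumption: all of the wall time is spent in cp2k, evaluations run serially.

def seconds_per_evaluation(wall_days, n_steps, evals_per_step):
    """Return (total number of evaluations, seconds spent per evaluation)."""
    total_evals = n_steps * evals_per_step
    wall_seconds = wall_days * 24 * 3600
    return total_evals, wall_seconds / total_evals

total_evals, per_eval = seconds_per_evaluation(wall_days=4, n_steps=130, evals_per_step=5)
print(f"{total_evals} evaluations, {per_eval:.0f} s ({per_eval / 60:.1f} min) each")
```

That comes out at roughly 530 s, i.e. just under 9 minutes per energy evaluation, which is the number to compare against a standalone cp2k run on the same box.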
cheers