To answer your questions:
1) The JavaScript implementation uses Rhino, which is slow by nature. There is no benchmark comparing the Ruby implementation with Rhino (you could try, but I don't expect any performance improvement). If you are using Java 8, you could create a custom processor using Nashorn (I'm planning to add support in the future), which is comparable to the V8 JS engine. But this requires some work on your side.
I would also suggest the following approach: precompile (using a build-time solution) some of the Sass libraries and use Sass only for a small subset of resources.
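For the Nashorn option in point 1, here is a minimal sketch of how a script could be run through the standard javax.script API. The class name, the transform function and the tiny script are hypothetical stand-ins for a real compiler script, and none of this is wro4j API. It assumes a JDK that still bundles Nashorn (Java 8 to 14); on other JDKs the engine lookup returns null and the sketch returns an empty result.

```java
import java.util.Optional;
import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical helper, not part of wro4j: runs a JavaScript function through Nashorn.
final class NashornSketch {
    // Stand-in for a real compiler script (assumption for illustration only):
    // collapses runs of whitespace and trims the result.
    static final String SCRIPT =
        "function transform(css) { return css.replace(/\\s+/g, ' ').trim(); }";

    static Optional<String> transform(String input) {
        // getEngineByName returns null when no engine is registered under this name.
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("nashorn");
        if (engine == null) {
            return Optional.empty();
        }
        try {
            // Load the script once, then call its function with the resource content.
            engine.eval(SCRIPT);
            Object result = ((Invocable) engine).invokeFunction("transform", input);
            return Optional.of(String.valueOf(result));
        } catch (ScriptException | NoSuchMethodException e) {
            throw new IllegalStateException("Script evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
            transform("a  {  color: red; }").orElse("Nashorn engine not available"));
    }
}
```

In a real processor you would probably want to create the engine once and reuse it, since engine startup and script evaluation are the expensive parts.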
2) The cache is supposed to work for the second call. If it doesn't, then it is likely a bug. As far as I remember, this scenario is covered by unit tests, and that part of the code has not changed for a long time. If you can reproduce the bug (try debugging), let me know and I will open an issue and fix it for the next release.
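One way to check whether the second call really hits the cache is to count how often the expensive step runs. Below is a minimal sketch of the expected behaviour, using a hypothetical wrapper around a generic function rather than wro4j's own cache classes. Keying the cache by the raw source content is a simplification made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical wrapper, not part of wro4j: caches results and counts cache misses.
final class CachedProcessorSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final AtomicInteger misses = new AtomicInteger();
    private final Function<String, String> delegate;

    CachedProcessorSketch(Function<String, String> delegate) {
        this.delegate = delegate;
    }

    String process(String source) {
        // The delegate runs only when there is no cached entry for this source.
        return cache.computeIfAbsent(source, key -> {
            misses.incrementAndGet();
            return delegate.apply(key);
        });
    }

    // Number of times the delegate was actually invoked.
    int missCount() {
        return misses.get();
    }
}
```

If you process the same input twice and the miss count goes to 2, the second call did not come from the cache, which is the kind of evidence that would help in a bug report.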
Cheers,
Alex