Hey Ankur,
I like the idea of a comparison matrix. We already did something similar for Hadoop (parts of it are on the front page of our website), which we used for a local summit here. Comparing Stratosphere to Spark in the same way would be a natural extension of that. ;-)
Internally, we ran some benchmarks against Spark 0.7.3 (unfortunately right before the 0.8 release). We didn't publish the results because some aspects make the comparison unfair (for example, we have no fault tolerance right now, whereas Spark does). As soon as we (re-)introduce fault tolerance mechanisms, we will re-run the benchmarks.
I can publish the code for the Stratosphere and Spark programs we looked at on GitHub. If I add Scala versions of the Stratosphere programs, this would also move in the direction you proposed of having a direct comparison.
Is there a specific use case for which you want to see numbers? Or is it more of a general interest in how both systems perform?
Best,
Ufuk