The 2014 Workshop on Statistical Machine Translation (WMT14) includes a parallel corpus derived from Common Crawl, whose EN-RU component comprises 878K sentence pairs extracted from 123K pages on 21K domains: http://www.statmt.org/wmt14/translation-task.html
There are also papers describing previous efforts to extract parallel texts and language models from Common Crawl:
Smith, Jason R., Herve Saint-Amand, Magdalena Plamada, Philipp Koehn, Chris Callison-Burch, and Adam Lopez. "Dirt Cheap Web-Scale Parallel Text from the Common Crawl." In ACL (1), pp. 1374-1383. 2013.
Buck, Christian, Kenneth Heafield, and Bas van Ooyen. "N-gram counts and language models from the common crawl." In Proceedings of the Language Resources and Evaluation Conference. 2014.
As far as language stats go, the second paper reports the following breakdown (running CLD2 on the WET files):
                 Relative occurrence %        Size
Language         2012     2013     both       both
English         54.79    79.53    67.05      23.62 TiB
German           4.53     1.23     2.89       1.02 TiB
Spanish          3.91     1.68     2.80     986.86 GiB
French           4.01     1.14     2.59     912.16 GiB
Japanese         3.11     0.14     1.64     577.14 GiB
Russian          2.93     0.09     1.53     537.36 GiB
Polish           1.81     0.08     0.95     334.31 GiB
Italian          1.40     0.44     0.92     325.58 GiB
Portuguese       1.32     0.48     0.90     316.87 GiB
Chinese          1.45     0.04     0.75     264.91 GiB
Dutch            0.95     0.22     0.59     207.90 GiB
other           12.23    12.57    12.40      4.37 TiB
As the stats show (see the original paper for the full version), there is a significant shift toward English from 2012 to 2013 (from ~55% to ~80%), with a corresponding drop in non-English languages (e.g. Russian falls from ~3% to <0.1%).
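For reference, the language stats above come from running CLD2 over the WET (extracted plain text) files. As a rough illustration of what those files contain, here is a minimal sketch of a WET record parser, assuming the standard WARC record layout (header block, blank line, Content-Length bytes of payload); the record below is synthetic, and for real work a library such as warcio (plus CLD2 bindings such as pycld2) would be the sensible choice:

```python
import io

def iter_wet_records(stream):
    """Yield (target_uri, text) from an uncompressed WET byte stream.

    Minimal sketch: WET files are WARC files whose 'conversion' records
    hold the extracted plain text of each page. Each record is a header
    block ended by a blank line, then Content-Length bytes of payload.
    (Real WET files are gzip-compressed; use warcio in production.)
    """
    while True:
        line = stream.readline()
        if not line:            # end of stream
            return
        if not line.strip():    # blank separators between records
            continue
        if line.strip() != b"WARC/1.0":
            raise ValueError("expected a WARC record header")
        headers = {}
        while True:
            line = stream.readline()
            if not line.strip():
                break
            key, _, value = line.decode("utf-8").partition(":")
            headers[key.strip()] = value.strip()
        payload = stream.read(int(headers.get("Content-Length", "0")))
        if headers.get("WARC-Type") == "conversion":
            yield (headers.get("WARC-Target-URI"),
                   payload.decode("utf-8", "replace"))

# Tiny synthetic record for illustration (not real crawl data):
data = (b"WARC/1.0\r\n"
        b"WARC-Type: conversion\r\n"
        b"WARC-Target-URI: http://example.com/\r\n"
        b"Content-Length: 13\r\n"
        b"\r\n"
        b"Hello world.\n"
        b"\r\n\r\n")
print(list(iter_wet_records(io.BytesIO(data))))
```

Feeding each record's text to a language detector and tallying the results per crawl year reproduces the kind of breakdown shown in the table.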
Tom