I cloned the HiStar source code from
http://www.scs.stanford.edu/histar/gitrepo/ and ran a lines-of-code counter on it (blank lines are not counted as code). Here are the statistics:
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C                             2118          41494          49316         254002
C/C++ Header                  1677          27066          38442         107512
C++                            265           6888           2365          30720
Assembly                       389           4468           6895          22135
Furthermore, I believe a lot of that code comes from ./pkg/uclibc [a small C library for Linux by Erik Andersen]. For instance, locale_data.c is 20,000 lines by itself, regex_old.c is another 5,500 lines, and so on. So the source code actually written by the researchers is probably not even on the order of one hundred thousand lines. Of course, I could be wrong.
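To check that guess rather than eyeball it, one could count the tree twice, once with ./pkg/uclibc left out, and compare. Below is a rough sketch of that idea, not the counter I actually ran: the function name, the extension list, and the `histar` checkout directory are all assumptions made for illustration, and it only skips blank lines (comments are still counted), so its totals will be higher than the "code" column above.

```python
import os

# Extensions treated as source. This list is an assumption for illustration,
# not the language table of any real line counter.
SOURCE_EXTS = {".c", ".h", ".cc", ".cpp", ".hh", ".S", ".s"}


def count_nonblank_lines(root, exclude_dirs=()):
    """Count non-blank lines in source files under root.

    exclude_dirs holds paths relative to root (e.g. "pkg/uclibc") whose
    whole subtree is skipped. Comment lines are NOT filtered out.
    """
    excluded = {os.path.normpath(d) for d in exclude_dirs}
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Prune excluded subtrees in place so os.walk never descends into them.
        dirnames[:] = [
            d for d in dirnames
            if os.path.normpath(os.path.join(rel, d)) not in excluded
        ]
        for name in filenames:
            if os.path.splitext(name)[1] in SOURCE_EXTS:
                path = os.path.join(dirpath, name)
                with open(path, errors="replace") as f:
                    total += sum(1 for line in f if line.strip())
    return total


if __name__ == "__main__":
    # "histar" is a hypothetical local checkout directory.
    everything = count_nonblank_lines("histar")
    without_libc = count_nonblank_lines("histar", exclude_dirs=["pkg/uclibc"])
    print("all:", everything)
    print("without pkg/uclibc:", without_libc)
    print("pkg/uclibc share:", everything - without_libc)
```

The difference between the two totals is the share contributed by the bundled C library. Other third-party directories, if there are any, could be added to `exclude_dirs` in the same way.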