Lines of code/man-months in HiStar?


David Zhang

Jan 16, 2013, 3:45:32 AM
to stanford-...@googlegroups.com
As a software developer, I'm curious about the scope of HiStar though I know it's not really relevant to the subject of the paper.

On page 100/Figure 7, the total LOC of the HiStar webserver is about 1.4 million, and of course a webserver is not even part of the core OS itself. Is this typical for a research operating system? How is it that this many lines of code are able to make it into a non-commercial piece of software? (Are there vast underground legions of grad students hammering away at HiStar?)

Ted Kim

Jan 16, 2013, 4:11:05 AM
to stanford-...@googlegroups.com
I'd like to second this question (+1 for images of underground legions).

This is a tangent, but I was recently reading about the Doom 3 source code (link if interested), which has ~600k lines of code for a production game, although it's almost 10 years old now. 1.4M is pretty huge.

-ted

Frank Chen

Jan 16, 2013, 4:33:47 AM
to Ted Kim, stanford-...@googlegroups.com
I cloned the source code of HiStar from http://www.scs.stanford.edu/histar/gitrepo/ and ran a line-of-code counter on it (ignoring whitespace). Here are some statistics:

-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C                             2118          41494          49316         254002
C/C++ Header                  1677          27066          38442         107512
C++                            265           6888           2365          30720
Assembly                       389           4468           6895          22135

Furthermore, I believe a lot of the code is from ./pkg/uclibc [a small C library for Linux by Erik Andersen]. For instance, locale_data.c is 20,000 lines by itself, regex_old.c is another 5,500 lines, etc. So the source code written by the researchers is probably not even on the order of one hundred thousand lines. Of course, I could be wrong.
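
A minimal sketch for totaling the rows shown in the table above. The row values are copied from that table; the names `ROWS` and `totals` are just illustrative, and the sketch assumes the table lists every language the counter reported, which may not be the case.

```python
# Rows copied from the counter output above:
# (language, files, blank, comment, code).
ROWS = [
    ("C", 2118, 41494, 49316, 254002),
    ("C/C++ Header", 1677, 27066, 38442, 107512),
    ("C++", 265, 6888, 2365, 30720),
    ("Assembly", 389, 4468, 6895, 22135),
]


def totals(rows):
    """Sum each numeric column across all language rows."""
    columns = ("files", "blank", "comment", "code")
    # Column i of the output corresponds to index i + 1 in each row,
    # because index 0 holds the language name.
    return {name: sum(row[i + 1] for row in rows) for i, name in enumerate(columns)}


if __name__ == "__main__":
    for name, value in totals(ROWS).items():
        print(f"{name:>8}: {value}")
```

For these four rows the code column sums to 414,369 lines, well short of 1.4 million, though the figure in the paper may count components differently.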

Frank





ivan

Jan 16, 2013, 4:34:48 AM
to stanford-...@googlegroups.com
http://www.scs.stanford.edu/histar/src/ :)

The page cites 4 people working on it though.

I find it interesting as a solution to security concerns such as a user having access to a file while its use is restricted to a single device, for example just a monitor, or a specific monitor inside a building, or allowing copies and r/w access only on a specific hard drive to prevent leaks to thumb drives or external HDs.

I wonder if a similar concept has ever been applied to other programs like Chrome. If web apps are moving toward being able to use device hardware, would it be possible to secure access at a very granular level, or to pre-parse an app for security concerns, given its source code plus labels, in order to block malware?

Deian Stefan

Jan 16, 2013, 9:49:30 AM
to stanford-...@googlegroups.com
So 1.4M LOC for anything security-oriented should worry you. Fortunately, all the security checks etc. are in the small kernel (TCB), so everything built on top is effectively untrusted. The kernel itself is roughly 15K LOC (very tight, readable code too!). The CACM paper doesn't talk about this, but you can check out Section 4.1 of the original SOSP paper.


On Wednesday, January 16, 2013 12:45:32 AM UTC-8, David Zhang wrote:

Deian Stefan

Jan 16, 2013, 9:55:37 AM
to stanford-...@googlegroups.com
We've been arguing for this in the SCS for a while and have implemented it on top of Native Client and, more recently, Dune (project led by Andrea Bittau).
We also started hacking on V8 to do this, so look out for a submission in the coming months :)