Lines of code/man-months in HiStar?

David Zhang

Jan 16, 2013, 3:45:32 AM

to stanford-...@googlegroups.com

As a software developer, I'm curious about the scope of HiStar, though I know it's not really relevant to the subject of the paper.

On page 100 (Figure 7), the total LOC of the HiStar webserver is about 1.4 million, and of course a webserver is not even part of the core OS itself. Is this typical for a research operating system? How does this many lines of code make it into a non-commercial piece of software? (Are there vast underground legions of grad students hammering away at HiStar?)

Ted Kim

Jan 16, 2013, 4:11:05 AM

to stanford-...@googlegroups.com

I'd like to second this question (+1 for images of underground legions).

This is a tangent, but I was recently reading about the Doom 3 source code (link if interested), which has ~600k lines of code for a production game, although it's almost 10 years old now. 1.4M is pretty huge.

-ted

Frank Chen

Jan 16, 2013, 4:33:47 AM

to Ted Kim, stanford-...@googlegroups.com

I cloned the HiStar source code from http://www.scs.stanford.edu/histar/gitrepo/ and ran a line-of-code counter on it (ignoring whitespace). Here are some statistics:

-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C                             2118          41494          49316         254002
C/C++ Header                  1677          27066          38442         107512
C++                            265           6888           2365          30720
Assembly                       389           4468           6895          22135

Furthermore, I believe a lot of the code is from ./pkg/uclibc [a small C library for Linux by Erik Andersen]. For instance, locale_data.c is 20,000 lines by itself, regex_old.c is another 5,500 lines, and so on. So the source code written by the researchers is probably not even on the order of one hundred thousand lines. Of course, I could be wrong.
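For anyone who wants to check the arithmetic or redo the count without the bundled C library, here is a rough sketch. It assumes a local checkout, and the function names, the extension list, and the "histar" path in the usage note are hypothetical. It counts non-blank lines (comments included), so its numbers will not match the "code" column of the counter output above.

```python
import os

# "code" column from the counter output quoted above (only the four rows shown).
CLOC_CODE_LINES = {
    "C": 254002,
    "C/C++ Header": 107512,
    "C++": 30720,
    "Assembly": 22135,
}


def total_code_lines(rows):
    """Sum the per-language code counts."""
    return sum(rows.values())


def count_nonblank_lines(root, skip_dirs=(), extensions=(".c", ".h", ".cc", ".S")):
    """Count non-blank lines in matching files under root.

    skip_dirs holds directory paths relative to root (e.g. "pkg/uclibc")
    that should be left out of the count. Comment lines are NOT filtered,
    so this is cruder than a real line-of-code counter.
    """
    skip = {os.path.normpath(d) for d in skip_dirs}
    extensions = tuple(extensions)
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Prune skipped directories in place so os.walk never descends into them.
        dirnames[:] = [
            d for d in dirnames
            if os.path.normpath(os.path.join(rel, d)) not in skip
        ]
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                with open(path, errors="replace") as f:
                    total += sum(1 for line in f if line.strip())
    return total


if __name__ == "__main__":
    print(total_code_lines(CLOC_CODE_LINES))
```

The four rows quoted above add up to 414,369 lines. With a checkout in a directory called, say, "histar", something like count_nonblank_lines("histar", skip_dirs=["pkg/uclibc"]) would give a rough idea of how much remains once the C library is excluded.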

Frank

ivan

Jan 16, 2013, 4:34:48 AM

to stanford-...@googlegroups.com

http://www.scs.stanford.edu/histar/src/ :)

The page cites 4 people working on it though.

I find it interesting as a solution to security concerns such as giving a user access to a file while restricting the file's use to a single device (just a monitor, or a specific monitor inside a building), or allowing copies and read/write access only on a specific hard drive, to prevent leaks to thumb drives or external HDs.

I wonder if a similar concept was ever applied to other programs like Chrome. If web apps are moving toward being able to use the device hardware, would it be possible to secure access at a very granular level, or to pre-parse an app for security concerns, given its source code plus labels, in order to block malware?

Deian Stefan

Jan 16, 2013, 9:49:30 AM

to stanford-...@googlegroups.com

So 1.4M LOC for anything security-oriented should worry you. Fortunately, all the security checks etc. are in the small kernel (the TCB), so everything built on top is effectively untrusted. The kernel itself is roughly 15K LOC (very tight, readable code too!). The CACM paper doesn't talk about this, but you can check out Section 4.1 of the original SOSP paper.

Deian Stefan

Jan 16, 2013, 9:55:37 AM

to stanford-...@googlegroups.com

We've been arguing for this in the SCS for a while and have implemented it on top of Native Client and, more recently, Dune (project led by Andrea Bittau).
We also started hacking on V8 to do this, so look out for a submission in the coming months :)