Some fun facts:
1. You could probably get a full history of Hacker News by carefully crawling the HNSearch API[0]. Continually crawling the new page for links would let you stay in sync with Hacker News proper, and a monthly refresh of the data would probably be smart to catch anything you missed.
2. The source code for an early version of Hacker News has been floating around, distributed with Arc if I remember correctly. From PG's comments, the main updates to the site have been about dealing with spammers and voting ring detection.
Combining the two, it is quite possible to make a read-only version of Hacker News with a full history and proper linking. Most of the changes have been made only to the write-only sections, so it would be pretty straightforward to clone Hacker News and keep it cloned if you throw out the write parts. Additionally, if you got the lag for updates down below a few seconds, you could piggyback off Hacker News and just redirect submissions and comments there (you could even get really tricky with an iframe if that tickles your fancy). In that manner, you could clone the functionality of Hacker News with a full history. It seems like this wouldn't even take a whole summer to do. Maybe a week of concentrated effort, depending on how fast you can pull down the data from the API and how well you understand your stack.
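For what it's worth, here is a rough sketch of the sync side in Python. It is only an illustration under assumptions: Mirror, poll_new and refresh_range are names I made up, the item shape is a guess, and fetch_new / fetch_by_id stand in for whatever you write against the new page and the HNSearch API (I am not showing the real API calls here).

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional

# Hypothetical item shape, e.g. {"id": 123, "title": "..."}; not the real API schema.
Item = Dict[str, object]


class Mirror:
    """In-memory read-only mirror, keyed by item id."""

    def __init__(self) -> None:
        self.items: Dict[int, Item] = {}

    def merge(self, fetched: Iterable[Item]) -> List[int]:
        """Store fetched items and return the ids that were new or changed."""
        touched: List[int] = []
        for item in fetched:
            item_id = int(item["id"])
            if self.items.get(item_id) != item:
                self.items[item_id] = dict(item)
                touched.append(item_id)
        return touched


def poll_new(mirror: Mirror, fetch_new: Callable[[], Iterable[Item]]) -> List[int]:
    """Frequent pass: pull whatever is on the new page right now."""
    return mirror.merge(fetch_new())


def refresh_range(
    mirror: Mirror,
    fetch_by_id: Callable[[int], Optional[Item]],
    first_id: int,
    last_id: int,
) -> List[int]:
    """Slow pass (e.g. monthly): walk an id range to catch anything missed."""

    def found() -> Iterator[Item]:
        for item_id in range(first_id, last_id + 1):
            item = fetch_by_id(item_id)
            if item is not None:  # assumption: a missing id comes back as None
                yield item

    return mirror.merge(found())
```

You would call poll_new every few seconds and refresh_range on a much slower schedule. Both return the ids they touched, so the read-only pages only need to be re-rendered for those.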
-Zack