Steve N sent a quick query off-line, but I'd prefer to answer the general gist of his message here. It had two parts:
1. a query: how would one replace missing values in a column of a dataframe?
2. a suggestion to look into HADOOP
For the first comment/question: I'll put the query and example into examples/04-dataManipulation.lisp. In the future, though, it would save me time if folks could add queries directly to that file, as comments in the right "commented" section at the end of the file, and then I (or someone else) can implement them. So basically: check out or fork a copy, create a branch named local-YOUR-UNIQUE-NAME, push it back to GitHub, and send a pull request so we can share.
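To give a flavour of what such an example might look like, here is a minimal sketch in plain Common Lisp. It does not use the CLS dataframe API: a column is just a vector, the missing-value marker :na is my assumption, and replace-missing and column-mean are hypothetical helper names invented for this illustration.

```lisp
;; Hypothetical helpers, not part of CLS. A column is a plain sequence
;; and a missing value is marked with the keyword :na (an assumption).

(defun replace-missing (column replacement
                        &key (missing-p (lambda (x) (eq x :na))))
  "Return a fresh vector like COLUMN, with each missing entry
replaced by REPLACEMENT."
  (map 'vector
       (lambda (x) (if (funcall missing-p x) replacement x))
       column))

(defun column-mean (column
                    &key (missing-p (lambda (x) (eq x :na))))
  "Mean of the non-missing entries of COLUMN.
Signals a division-by-zero error if every entry is missing."
  (let ((present (remove-if missing-p column)))
    (/ (reduce #'+ present) (length present))))

;; Mean imputation on a toy column:
(let ((ages #(23 :na 31 :na 27)))
  (replace-missing ages (column-mean ages)))
;; => #(23 27 31 27 27)
```

The original column is left untouched. Whether the real example should copy or modify in place, and how missing values are actually encoded, depends on the dataframe structure and is not settled by this sketch.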
Specifically, Steve suggested looking at one of the large public databases, which is a good idea, since I can use the getting-data section (I'll probably write an examples/03-gettingWWWdata.lisp file) to demonstrate how to fetch data and make it accessible using CLS. I will probably use a smaller public-use dataset (something on the order of 1-10 MB, not 100 MB or GB) so that the example files do not take forever to run.
For the second, HADOOP, there is clearly an equivalence of interfaces -- Common Lisp (and Lisp before it) has offered map and reduce operations for eons, though not the parallelisation across machines. But for the specific issue ... we'd need a tie-in to the systems, and I've not got the bandwidth right now to do it. That said, it ought to be a simple matter of using LPARALLEL (which does have such structures) to do the dispatch to the lower-level HADOOP infrastructure. That claim is in fact a throw-away, based on limited reading of the APIs and literature for those two systems; I haven't looked at feasibility.
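For anyone who has not seen the map-reduce shape in Lisp, a minimal single-machine sketch using only standard Common Lisp functions follows. The function names sum-of-squares and sum-of-squares-keyed are made up for this illustration, and nothing here touches HADOOP or LPARALLEL.

```lisp
;; Map step: square each value. Reduce step: sum the results.
(defun sum-of-squares (xs)
  "Map-then-reduce over the sequence XS, on a single machine."
  (reduce #'+ (map 'list (lambda (x) (* x x)) xs)))

;; The same computation with the map step folded into REDUCE via :key,
;; and an explicit :initial-value so an empty input yields 0.
(defun sum-of-squares-keyed (xs)
  (reduce #'+ xs :key (lambda (x) (* x x)) :initial-value 0))

(sum-of-squares '(1 2 3 4))        ; => 30
(sum-of-squares-keyed #(1 2 3 4))  ; => 30
```

The distributed versions add the hard parts (splitting the data, shipping work to other machines, collecting results), which is exactly where the tie-in work would be.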
As always, things are moving, and faster right now. I expect a December slowdown as work piles up, some coding time between Christmas and New Year's when I'm in the mountains, and, hopefully, train-located coding time as I try to spend weekends in the mountains this year.
best,
-tony
blind...@gmail.com
Muttenz, Switzerland.
"Commit early, commit often, and commit in a repository from which we can easily roll-back your mistakes" (AJR, 4Jan05).
Drink Coffee: Do stupid things faster with more energy!