Well, but it's still fun; I once created a similar collection for Ben (and specifically, for an English-language teaching website), and it was interesting to see how words are connected to other words. One or several of the opencog blog entries dealt with the network statistics on such a graph; Joel then created a visualizations of some 20K or 100K or whatever of the most-connected word pairs, and Ben then demoed this at some conference. I think this was 2009.
FWIW, my data was not "context free", but rather the result of ranking word pairs within the context of the sentences that they occur in. That is, a pair of words were associated with each other only if there was not some other, better association in the sentence. Also, word pairs need not be adjeacent, there could have been intervening words that factored out in other ways.
The point is that, when you squint, this really does offer a first approximation to an association between words and concepts. The google post doesn't really emphasize this, but when you crawl the graph, it does become clear: the word pair "Northern Ireland" really does almost surely refer to the political/geographical entity that you think it does, with a very, very high probability.
The key word here is "probability". So, yes, as you point out, "you can't parse a sentence until you know what it means", but the reverse is also true. If we work in the framework of Bayesian nets (or any similar network framework), then there's a back-n-forth: "Northern Ireland" provides a Bayesian prior: a very good place to start the parse is to assume "Northern Ireland" is a compound noun, and go from there. In the end, the parse may not allow this, but its a good place to start.
To re-iterate: such a graph is a first approximation to the connection between words and concepts. People have also talked about and taken the next step (something that I too, would very much like to replicate): build the graph of verbs that are likely to connect concepts: so e.g. bicycles can be ridden because people ride bikes. Being ride-able is a property (attribute) of the concept bicycle. By knowing the network of such attributes, one can now parse all that much more accurately: again, as a Bayesian prior: if we believe that a bicycle is ride-able with probability 0.95, then I can infer that the sentence: "I put on my bicycle helmet and rode out there." almost surely means that I rode a bicycle. Whatever.
Anyway, if one thinks of language, and concepts, and understanding as network graphs, then the google dataset is an interesting approximation. One still needs to have agents that can crawl over the network, add and remove nodes, strengthen and loosen connections, or qualify and intermediate between them, and also perform logical deduction, inference, etc. on them. The hard part is creating such agents, and getting them to work properly.