I'm not aware of PCFG being applied to text compression, although in theory it might work. The top compressors like paq8hp12 and durilca4linux model language mostly at the level of n-grams and a crude semantic/part-of-speech model in which related words are grouped in a predefined dictionary and whole groups are used as context. I believe that durilca4linux used a clustering algorithm to find related words. However, its top position in the LTCB is probably because it uses twice as much memory. The model itself is more primitive because it uses PPM rather than context mixing, which restricts the semantic model to a 1-dimensional space.
In earlier work in language models, modeling grammar has not been as successful as modeling semantics using sparse n-grams or LSA, but that might be because the corpus was not large enough to learn English grammar. That might be a problem for enwik8, although I believe enwik9 should be large enough. enwik8 is about the amount of language that a 3 year old child is exposed to, so it seems to me to be too small to train an adult level grammar model on it.
-- Matt Mahoney, matma...@yahoo.com
--- On Fri, 9/19/08, nadam <na...@freemail.hu> wrote: