http://en.wikipedia.org/wiki/Wikipedia:Database_download#English-language_Wikipedia
--
"I like to pay taxes. With them, I buy civilization." -- Oliver Wendell Holmes
Are the RelEx-parsed Wikipedia articles available?
Bruce Williams
Concepts, like individuals, have their histories and are just as incapable of
withstanding the ravages of time as are individuals. But in and through all this
they retain a kind of homesickness for the scenes of their childhood.
Soren Kierkegaard
raw wikipedia dump:
enwiki-20080524-pages-articles.xml.bz2 16-Jul-2008 16:59 3.7G
stripped of wiki markup:
enwiki-20080524-alpha.tar.bz2 23-Jul-2008 20:54 1.6G
In the http://gnucash.org/linas/nlp/data/enwiki-20101011/ directory, likewise
--linas
Bruce Williams
!? "a different parser"? I presume you mean the Stanford parser. Note
that the Wikipedia articles were not parsed in Stanford compatibility
mode, so I don't know how you expect to compare. And, given the
other recent email thread about how recent versions of link-grammar
bungled the constituent tree when a sentence contained and/or clauses,
comparing constituent trees will yield ... poor results.
--linas