Thank you!
One other question - are entries such as the following (in the 2010
version of the data) intended?
<aetna.n.57 url="http://www.ybr.com/aetna" title="Aetna Retiree Connection Newsletter - Ybr.com">
<TargetSentence />
I guess at the moment I am assuming these are web pages that had no text?
There appear to be entries like this in a few files...
grep -B 1 "TargetSentence /" *.xml
aetna.n.xml- <aetna.n.57 url="
http://www.ybr.com/aetna"
title="Aetna Retiree Connection Newsletter - Ybr.com">
aetna.n.xml: <TargetSentence />
--
bald_eagle.n.xml- <bald_eagle.n.23
url="
http://bensguide.gpo.gov/3-5/symbols/eagle.html" title="Bald
Eagle - Ben&#39;s Guide to U.S. Government for Kids - U.S. ...">
bald_eagle.n.xml: <TargetSentence />
--
bermuda_triangle.n.xml- <bermuda_triangle.n.15
url="
http://www.scubagreenville.com/" title="Bermuda Triangle,
Greenville, SC">
bermuda_triangle.n.xml: <TargetSentence />
--
black_planet.n.xml- <black_planet.n.26
url="
http://www.blackplanetrising.com/" title="BlackPlanet Rising">
black_planet.n.xml: <TargetSentence />
--
boomerang.n.xml- <boomerang.n.53
url="
http://www.theboomeranggroup.com/" title="The Boomerang Group">
boomerang.n.xml: <TargetSentence />
--
century_21.n.xml- <century_21.n.63 url="
http://mobile.c21.com/"
title="Moving &amp; Relocation - Century 21">
century_21.n.xml: <TargetSentence />
--
dsl.n.xml- <dsl.n.29 url="
http://dsl.org/" title="Dsl.org by Michael Stutz">
dsl.n.xml: <TargetSentence />
--
dsl.n.xml- <dsl.n.41 url="
http://www.qwest.com/dsl/" title="DSL -
High Speed Internet Access - Qwest">
dsl.n.xml: <TargetSentence />
--
eros.n.xml- <eros.n.46 url="
http://www.eros-os.com/" title="The
EROS Group, LLC">
eros.n.xml: <TargetSentence />
--
jurassic_park.n.xml- <jurassic_park.n.[...] url="[...]~brb1/jurassic.html"
title="[...] - Mississippi State University">
jurassic_park.n.xml: <TargetSentence />
--
kawasaki.n.xml- <kawasaki.n.33
url="
http://www.italee.com/kawa/main.htm" title="KAZUO KAWASAKI by
ITALEE">
kawasaki.n.xml: <TargetSentence />
--
lemonade_stand.n.xml- <lemonade_stand.n.48
url="
http://makeastandlemonade.com/" title="MAKEaSTAND!">
lemonade_stand.n.xml: <TargetSentence />
--
match_point.n.xml- <match_point.n.8
url="
http://www.matchpoint.dreamworks.com/" title="Match Point -
DreamWorks Animation">
match_point.n.xml: <TargetSentence />
--
mortal_kombat.n.xml- <mortal_kombat.n.28
url="
http://www.worldscollide.com/" title="Mortal Kombat vs. DC
Universe">
mortal_kombat.n.xml: <TargetSentence />
--
queen.n.xml- <queen.n.21 url="
http://www.queen.net/" title="Queen">
queen.n.xml: <TargetSentence />
--
romeo_and_juliet.n.xml- <romeo_and_juliet.n.30
url="
http://operacolorado.org/%3Fpage_id%3D169" title="Romeo and
Juliet Opera Colorado">
romeo_and_juliet.n.xml: <TargetSentence />
--
virgo.n.xml- <virgo.n.25 url="
http://virgo.lib.virginia.edu/"
title="Virgo - University of Virginia">
virgo.n.xml: <TargetSentence />
--
virgo.n.xml- <virgo.n.33 url="
http://www.virgo.infn.it/"
title="Virgo - Infn">
virgo.n.xml: <TargetSentence />
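
In case it helps with checking, here is a rough Python sketch of the same search that prints one line per affected instance instead of raw grep context. It assumes each instance opens with a tag carrying a url attribute and that an empty context appears as a self-closing <TargetSentence /> immediately after that tag, as in the output above. The names find_empty_instances and scan_directory are just mine for illustration and are not part of the data release.

```python
import re
from pathlib import Path

# Assumption: an instance looks like <word.n.57 url="..." title="...">
# and an empty context is a self-closing <TargetSentence /> right after it.
EMPTY_INSTANCE = re.compile(
    r'<(?P<id>[\w.]+)\s+url="\s*(?P<url>[^"]*?)\s*"[^>]*>\s*<TargetSentence\s*/>'
)


def find_empty_instances(text):
    """Return (instance id, url) pairs whose TargetSentence is empty."""
    return [(m.group("id"), m.group("url")) for m in EMPTY_INSTANCE.finditer(text)]


def scan_directory(directory="."):
    """Map each *.xml file name to the empty instances found in it."""
    results = {}
    for path in sorted(Path(directory).glob("*.xml")):
        # errors="replace" keeps the scan going if a file has bad bytes
        text = path.read_text(encoding="utf-8", errors="replace")
        hits = find_empty_instances(text)
        if hits:
            results[path.name] = hits
    return results


if __name__ == "__main__":
    for name, hits in scan_directory().items():
        for instance_id, url in hits:
            print(f"{name}\t{instance_id}\t{url}")
```

Run from the directory holding the .xml files, this should list the file name, instance id, and URL for each empty entry, which makes it easier to count them per word.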
Thanks!
Ted