<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <id>http://groups.google.com/group/nytnlp</id>
  <title type="text">The New York Times Annotated Corpus Community Google Group</title>
  <subtitle type="text">
  The New York Times Natural Language Processing group exists to provide a community for researchers working with The New York Times Annotated Corpus. This group is maintained by the New York Times and should be used as a forum for discussing any and all matters relating to the corpus.
  </subtitle>
  <link href="/group/nytnlp/feed/atom_v1_0_msgs.xml" rel="self" title="The New York Times Annotated Corpus Community feed"/>
  <updated>2009-09-16T23:24:09Z</updated>
  <generator uri="http://groups.google.com" version="1.99">Google Groups</generator>
  <entry>
  <author>
  <name>Evan Sandhaus &lt;sandhes@nytimes.com&gt;</name>
  <email>kan...@gmail.com</email>
  </author>
  <updated>2009-09-16T23:24:09Z</updated>
  <id>http://groups.google.com/group/nytnlp/browse_thread/thread/0a9a04a7e98439ae/2825fec50508fdfd?show_docid=2825fec50508fdfd</id>
  <link href="http://groups.google.com/group/nytnlp/browse_thread/thread/0a9a04a7e98439ae/2825fec50508fdfd?show_docid=2825fec50508fdfd"/>
  <title type="text">Spam Messages</title>
  <summary type="html" xml:space="preserve">
  We seem to be experiencing an upsurge in spam posted to our group. I &lt;br&gt; apologize for the unwelcome traffic and assure you that I will remain &lt;br&gt; vigilant about scrubbing our group of all inappropriate content. &lt;br&gt; &lt;p&gt;All the best, &lt;br&gt; &lt;p&gt;Evan
  </summary>
  </entry>
  <entry>
  <author>
  <name>Evan Sandhaus &lt;sandhes@nytimes.com&gt;</name>
  <email>kan...@gmail.com</email>
  </author>
  <updated>2009-04-14T21:18:33Z</updated>
  <id>http://groups.google.com/group/nytnlp/browse_thread/thread/6878fb4c35588864/68c32d8e16bb2342?show_docid=68c32d8e16bb2342</id>
  <link href="http://groups.google.com/group/nytnlp/browse_thread/thread/6878fb4c35588864/68c32d8e16bb2342?show_docid=68c32d8e16bb2342"/>
  <title type="text">Call for prototypes, demos and research summaries</title>
  <summary type="html" xml:space="preserve">
  To the NYT Annotated Corpus Community, &lt;br&gt; &lt;p&gt;Recently, I was invited to deliver the closing keynote &lt;br&gt; address at the 2009 Semantic Technologies Conference in San Jose, &lt;br&gt; California. Part of my remarks will focus on The New York Times &lt;br&gt; Annotated Corpus and I hope to include examples of the exciting work
  </summary>
  </entry>
  <entry>
  <author>
  <name>Evan Sandhaus &lt;sandhes@nytimes.com&gt;</name>
  <email>kan...@gmail.com</email>
  </author>
  <updated>2009-04-14T21:06:26Z</updated>
  <id>http://groups.google.com/group/nytnlp/browse_thread/thread/4d24d7f6638b6640/8d7823c14c8195c1?show_docid=8d7823c14c8195c1</id>
  <link href="http://groups.google.com/group/nytnlp/browse_thread/thread/4d24d7f6638b6640/8d7823c14c8195c1?show_docid=8d7823c14c8195c1"/>
  <title type="text">Re: Queries for Ad-hoc Retrieval Experiments on NYT Corpus</title>
  <summary type="html" xml:space="preserve">
  All, &lt;br&gt; &lt;p&gt;I realize that this doesn&#39;t provide a whole lot of queries, but it&#39;s &lt;br&gt; better than nothing. &lt;br&gt; &lt;p&gt;&lt;a target=&quot;_blank&quot; rel=nofollow href=&quot;http://www.nytimes.com/gst/mostsearched.html&quot;&gt;[link]&lt;/a&gt; &lt;br&gt; &lt;p&gt;You get the top 25 daily, weekly, and monthly queries on nytimes.com &lt;br&gt; along with related queries for each query. The data is updated &lt;br&gt; hourly. I&#39;ll poke around at The Times and see if an expanded version
  </summary>
  </entry>
  <entry>
  <author>
  <name>Daniel Tunkelang</name>
  <email>dtunkel...@gmail.com</email>
  </author>
  <updated>2009-04-14T20:37:30Z</updated>
  <id>http://groups.google.com/group/nytnlp/browse_thread/thread/4d24d7f6638b6640/694eab5d6c7d388a?show_docid=694eab5d6c7d388a</id>
  <link href="http://groups.google.com/group/nytnlp/browse_thread/thread/4d24d7f6638b6640/694eab5d6c7d388a?show_docid=694eab5d6c7d388a"/>
  <title type="text">Re: Queries for Ad-hoc Retrieval Experiments on NYT Corpus</title>
  <summary type="html" xml:space="preserve">
  For that matter, there are also the daily Google &lt;br&gt; Trends&amp;lt;&lt;a target=&quot;_blank&quot; rel=nofollow href=&quot;http://www.google.com/trends/hottrends&quot;&gt;[link]&lt;/a&gt;&amp;gt;queries, available as &lt;br&gt; far back as May &lt;br&gt; 15, 2007 &amp;lt;&lt;a target=&quot;_blank&quot; rel=nofollow href=&quot;http://www.google.com/trends/hottrends?sa=X&amp;date=2007-5-15&quot;&gt;[link]&lt;/a&gt;&amp;gt;. &lt;br&gt; &lt;p&gt;Daniel &lt;br&gt; &lt;p&gt;-- &lt;br&gt; Daniel Tunkelang &lt;br&gt; Chief Scientist, Endeca &lt;br&gt; Blog: &lt;a target=&quot;_blank&quot; rel=nofollow href=&quot;http://thenoisychannel.com/&quot;&gt;[link]&lt;/a&gt;
  </summary>
  </entry>
  <entry>
  <author>
  <name>Brendan O&#39;Connor</name>
  <email>breno...@gmail.com</email>
  </author>
  <updated>2009-04-14T20:30:46Z</updated>
  <id>http://groups.google.com/group/nytnlp/browse_thread/thread/4d24d7f6638b6640/2338022418875af6?show_docid=2338022418875af6</id>
  <link href="http://groups.google.com/group/nytnlp/browse_thread/thread/4d24d7f6638b6640/2338022418875af6?show_docid=2338022418875af6"/>
  <title type="text">Re: Queries for Ad-hoc Retrieval Experiments on NYT Corpus</title>
  <summary type="html" xml:space="preserve">
  I&#39;d be interested in hearing about this too, if anyone has looked into it. &lt;br&gt; &lt;p&gt;The lack of availability of real-world query logs is a huge impediment to &lt;br&gt; information retrieval research. At least in web search, it seems like &lt;br&gt; private companies (Google, Yahoo, Microsoft) are making academics obsolete
  </summary>
  </entry>
  <entry>
  <author>
  <name>kberberi</name>
  <email>klaus.berber...@gmail.com</email>
  </author>
  <updated>2009-04-14T12:38:33Z</updated>
  <id>http://groups.google.com/group/nytnlp/browse_thread/thread/4d24d7f6638b6640/fd116e2727d77d94?show_docid=fd116e2727d77d94</id>
  <link href="http://groups.google.com/group/nytnlp/browse_thread/thread/4d24d7f6638b6640/fd116e2727d77d94?show_docid=fd116e2727d77d94"/>
  <title type="text">Queries for Ad-hoc Retrieval Experiments on NYT Corpus</title>
  <summary type="html" xml:space="preserve">
  Hi, &lt;br&gt; &lt;p&gt;I was wondering whether anyone has conducted ad-hoc retrieval &lt;br&gt; experiments using the NYT corpus. &lt;br&gt; &lt;p&gt;If so, which queries did you use? Of course it would be great if a &lt;br&gt; &amp;quot;real&amp;quot; workload (no click log needed!) was available (e.g., as &lt;br&gt; observed on the site search of New York Times). &lt;br&gt; &lt;p&gt;Thanks a lot and kind regards,
  </summary>
  </entry>
  <entry>
  <author>
  <name>Evan Sandhaus &lt;sandhes@nytimes.com&gt;</name>
  <email>kan...@gmail.com</email>
  </author>
  <updated>2009-03-30T13:50:46Z</updated>
  <id>http://groups.google.com/group/nytnlp/browse_thread/thread/46882e8f4a0a36e3/1e1fccd47c0f2ea1?show_docid=1e1fccd47c0f2ea1</id>
  <link href="http://groups.google.com/group/nytnlp/browse_thread/thread/46882e8f4a0a36e3/1e1fccd47c0f2ea1?show_docid=1e1fccd47c0f2ea1"/>
  <title type="text">Re: Why doesn&#39;t the IEEE stop the Fraud and the scam? Why doesn&#39;t the IEEE stop the Fraud and the scam?</title>
  <summary type="html" xml:space="preserve">
  This post is off topic for this group, and has been removed.
  </summary>
  </entry>
  <entry>
  <author>
  <name>Evan Sandhaus &lt;sandhes@nytimes.com&gt;</name>
  <email>kan...@gmail.com</email>
  </author>
  <updated>2009-01-13T17:57:00Z</updated>
  <id>http://groups.google.com/group/nytnlp/browse_thread/thread/988e3a374c578b2c/adb95abc182df85e?show_docid=adb95abc182df85e</id>
  <link href="http://groups.google.com/group/nytnlp/browse_thread/thread/988e3a374c578b2c/adb95abc182df85e?show_docid=adb95abc182df85e"/>
  <title type="text">Re: Happy 2009 From The New York Times</title>
  <summary type="html" xml:space="preserve">
  Neal, &lt;br&gt; &lt;p&gt;To your questions: &lt;br&gt; &lt;p&gt;The kind of work you describe is not permitted under the LDC &lt;br&gt; license. That being said, we are very open to discussing a &lt;br&gt; commercial licensing arrangement to our data. Please follow up with &lt;br&gt; me at sand...@nytimes.com to discuss specifics. &lt;br&gt; &lt;p&gt;The controlled indexing vocabulary is &amp;quot;included&amp;quot; with the corpus
  </summary>
  </entry>
  <entry>
  <author>
  <name>Evan Sandhaus &lt;sandhes@nytimes.com&gt;</name>
  <email>kan...@gmail.com</email>
  </author>
  <updated>2009-01-13T17:48:13Z</updated>
  <id>http://groups.google.com/group/nytnlp/browse_thread/thread/988e3a374c578b2c/3678c055b9fac63d?show_docid=3678c055b9fac63d</id>
  <link href="http://groups.google.com/group/nytnlp/browse_thread/thread/988e3a374c578b2c/3678c055b9fac63d?show_docid=3678c055b9fac63d"/>
  <title type="text">Re: Happy 2009 From The New York Times</title>
  <summary type="html" xml:space="preserve">
  Dan, &lt;br&gt; &lt;p&gt;The Times and I understand the value of public facing prototypes and &lt;br&gt; will look into this issue. As soon as we have a definitive answer we &lt;br&gt; will report it back to this forum. &lt;br&gt; &lt;p&gt;All the best, &lt;br&gt; &lt;p&gt;Evan
  </summary>
  </entry>
  <entry>
  <author>
  <name>Neal Richter</name>
  <email>nrich...@gmail.com</email>
  </author>
  <updated>2009-01-13T04:45:39Z</updated>
  <id>http://groups.google.com/group/nytnlp/browse_thread/thread/988e3a374c578b2c/c3d6c0a63a26a2c5?show_docid=c3d6c0a63a26a2c5</id>
  <link href="http://groups.google.com/group/nytnlp/browse_thread/thread/988e3a374c578b2c/c3d6c0a63a26a2c5?show_docid=c3d6c0a63a26a2c5"/>
  <title type="text">Re: Happy 2009 From The New York Times</title>
  <summary type="html" xml:space="preserve">
  Evan, &lt;br&gt; &lt;p&gt; I see that the license is non-commercial. Some of the Licenses in &lt;br&gt; the LDC allow creating derivative data. &lt;br&gt; &lt;p&gt;Example: &lt;br&gt; b. Summaries, analyses and interpretations of the linguistic &lt;br&gt; properties of the text may be derived and published, provided it is &lt;br&gt; not possible to reconstruct the information from these summaries.
  </summary>
  </entry>
  <entry>
  <author>
  <name>Daniel Tunkelang</name>
  <email>dtunkel...@gmail.com</email>
  </author>
  <updated>2009-01-12T15:21:23Z</updated>
  <id>http://groups.google.com/group/nytnlp/browse_thread/thread/988e3a374c578b2c/e7307573cf523cc2?show_docid=e7307573cf523cc2</id>
  <link href="http://groups.google.com/group/nytnlp/browse_thread/thread/988e3a374c578b2c/e7307573cf523cc2?show_docid=e7307573cf523cc2"/>
  <title type="text">Re: Happy 2009 From The New York Times</title>
  <summary type="html" xml:space="preserve">
  Evan, &lt;br&gt; &lt;p&gt;What are the guidelines around using the corpus in publicly facing &lt;br&gt; demonstrations? Such demos could generate massive publicity around the &lt;br&gt; corpus, not to mention spurring a healthy competition among information &lt;br&gt; access vendors. &lt;br&gt; &lt;p&gt;Daniel &lt;br&gt; &lt;p&gt;On Mon, Jan 12, 2009 at 8:10 AM, Evan Sandhaus &amp;lt;sand...@nytimes.com&amp;gt; &amp;lt;
  </summary>
  </entry>
  <entry>
  <author>
  <name>Evan Sandhaus &lt;sandhes@nytimes.com&gt;</name>
  <email>kan...@gmail.com</email>
  </author>
  <updated>2009-01-12T13:10:29Z</updated>
  <id>http://groups.google.com/group/nytnlp/browse_thread/thread/988e3a374c578b2c/02a81a201b732182?show_docid=02a81a201b732182</id>
  <link href="http://groups.google.com/group/nytnlp/browse_thread/thread/988e3a374c578b2c/02a81a201b732182?show_docid=02a81a201b732182"/>
  <title type="text">Happy 2009 From The New York Times</title>
  <summary type="html" xml:space="preserve">
  Hello from The New York Times and Happy 2009, &lt;br&gt; &lt;p&gt;Thank you all for joining The New York Times Annotated Corpus &lt;br&gt; Community. My colleagues and I created this group in hopes of &lt;br&gt; fostering a vibrant research community around The New York Times &lt;br&gt; Annotated Corpus, and we are confident that 2009 will be a great year
  </summary>
  </entry>
  <entry>
  <author>
  <name>Paweł</name>
  <email>pavel.ma...@gmail.com</email>
  </author>
  <updated>2008-12-16T01:33:48Z</updated>
  <id>http://groups.google.com/group/nytnlp/browse_thread/thread/329e63a888ce3572/21a1bc3d95dba5e7?show_docid=21a1bc3d95dba5e7</id>
  <link href="http://groups.google.com/group/nytnlp/browse_thread/thread/329e63a888ce3572/21a1bc3d95dba5e7?show_docid=21a1bc3d95dba5e7"/>
  <title type="text">API: getting the publication date</title>
  <summary type="html" xml:space="preserve">
  Hi, &lt;br&gt; &lt;p&gt;In the NYTCorpusDocument class it would be good to have a method to &lt;br&gt; return the original publication date string, not the Date object. I am &lt;br&gt; referring to the value of nitf/head/pubda...@date.publication &lt;br&gt; (sorry if I missed something and it is already possible to do it). &lt;br&gt; &lt;p&gt;Pawel
  </summary>
  </entry>
  <entry>
  <author>
  <name>Jeryl Cook</name>
  <email>twoenc...@gmail.com</email>
  </author>
  <updated>2008-11-24T18:16:20Z</updated>
  <id>http://groups.google.com/group/nytnlp/browse_thread/thread/9ca091cd6622f60b/1c358899ff037048?show_docid=1c358899ff037048</id>
  <link href="http://groups.google.com/group/nytnlp/browse_thread/thread/9ca091cd6622f60b/1c358899ff037048?show_docid=1c358899ff037048"/>
  <title type="text">Re: Obtaining the corpus</title>
  <summary type="html" xml:space="preserve">
  I can&#39;t believe there is a cost for people to obtain this... &lt;br&gt; &lt;p&gt;On Nov 6, 11:30 am, &amp;quot;Evan Sandhaus &amp;lt;sand...@nytimes.com&amp;gt;&amp;quot;
  </summary>
  </entry>
</feed>
