<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <id>http://groups.google.com/group/nltk-users</id>
  <title type="text">nltk-users Google Group</title>
  <subtitle type="text">
  Discussion forum for users of the Natural Language Toolkit. When joining, please describe your interest so we know you are not a spammer. For help with corpus processing, see: http://gandalf.aksis.uib.no/corpora/ For help with Python programming, see: http://groups.google.com/group/comp.lang.python
  </subtitle>
  <link href="/group/nltk-users/feed/atom_v1_0_msgs.xml" rel="self" title="nltk-users feed"/>
  <updated>2013-06-18T06:31:41Z</updated>
  <generator uri="http://groups.google.com" version="1.99">Google Groups</generator>
  <entry>
  <author>
  <name>Vinit Kumar Upadhyay</name>
  <email>vinitkumarupadh...@gmail.com</email>
  </author>
  <updated>2013-06-18T06:31:41Z</updated>
  <id>http://groups.google.com/group/nltk-users/browse_thread/thread/48fe7ad1ca627ec1/3745388ce7be7fff?show_docid=3745388ce7be7fff</id>
  <link href="http://groups.google.com/group/nltk-users/browse_thread/thread/48fe7ad1ca627ec1/3745388ce7be7fff?show_docid=3745388ce7be7fff"/>
  <title type="text">Tagging phrases with type</title>
  <summary type="html" xml:space="preserve">
  Hi, &lt;br&gt; I am chunking certain phrases and then marking their type before POS tagging &lt;br&gt; them. For example, &amp;quot;windows xp&amp;quot; as OS. I have a corpus where phrases are &lt;br&gt; classified as per their type. I want an efficient NLTK tool to perform this &lt;br&gt; task. I looked at parsers but context free grammars require complete &lt;br&gt; coverage which I can&#39;t provide because there will be some terminals in my
  </summary>
  </entry>
  <entry>
  <author>
  <name>Richard Marsden</name>
  <email>winw...@gmail.com</email>
  </author>
  <updated>2013-06-17T22:54:20Z</updated>
  <id>http://groups.google.com/group/nltk-users/browse_thread/thread/43a6ea2ccc67ee64/22cc411b2b7e8f41?show_docid=22cc411b2b7e8f41</id>
  <link href="http://groups.google.com/group/nltk-users/browse_thread/thread/43a6ea2ccc67ee64/22cc411b2b7e8f41?show_docid=22cc411b2b7e8f41"/>
  <title type="text">Parsing, Feature Structures, and Semantics</title>
  <summary type="html" xml:space="preserve">
  I am trying to extract semantics from sentences, and am using Python &amp;amp; NLTK as &lt;br&gt; a learning toolkit. &lt;br&gt; &lt;p&gt;If I understand the (Bird et al.) book and the NLTK Semantics HowTo, first-order &lt;br&gt; semantics are extracted from a parse tree using root_semrep(). &lt;br&gt; However, root_semrep() requires a parse tree with feature structures.
  </summary>
  </entry>
  <entry>
  <author>
  <name>Jacob Perkins</name>
  <email>jap...@gmail.com</email>
  </author>
  <updated>2013-06-17T22:36:59Z</updated>
  <id>http://groups.google.com/group/nltk-users/browse_thread/thread/b51e9f8ffed4fd5a/1ef369f2f5c8ca1a?show_docid=1ef369f2f5c8ca1a</id>
  <link href="http://groups.google.com/group/nltk-users/browse_thread/thread/b51e9f8ffed4fd5a/1ef369f2f5c8ca1a?show_docid=1ef369f2f5c8ca1a"/>
  <title type="text">Re: parsing issues on -ing and -ed words</title>
  <summary type="html" xml:space="preserve">
  Hi Gene, &lt;br&gt; You can override tags from the default tagger by creating your own tagger, &lt;br&gt; as described &lt;br&gt; in &lt;a target=&quot;_blank&quot; rel=nofollow href=&quot;http://stackoverflow.com/questions/5919355/custom-tagging-with-nltk/5922373#5922373&quot;&gt;[link]&lt;/a&gt; &lt;br&gt; Or, you can try training your own pos tagger: &lt;br&gt; &lt;a target=&quot;_blank&quot; rel=nofollow href=&quot;http://streamhacker.com/2011/03/21/training-part-speech-taggers-nltk-trainer/&quot;&gt;[link]&lt;/a&gt;
  </summary>
  </entry>
  <entry>
  <author>
  <name>Jacob Perkins</name>
  <email>jap...@gmail.com</email>
  </author>
  <updated>2013-06-17T22:32:11Z</updated>
  <id>http://groups.google.com/group/nltk-users/browse_thread/thread/85e8f25f816c48b9/ce7af95b123027b0?show_docid=ce7af95b123027b0</id>
  <link href="http://groups.google.com/group/nltk-users/browse_thread/thread/85e8f25f816c48b9/ce7af95b123027b0?show_docid=ce7af95b123027b0"/>
  <title type="text">Re: phrasal replacement?</title>
  <summary type="html" xml:space="preserve">
  Hi Gene, &lt;br&gt; The CsvWordReplacer &amp;amp; RegexReplacer are examples from my cookbook, so &lt;br&gt; you&#39;re welcome to modify them however you want. If you do make useful &lt;br&gt; modifications, please share them - I&#39;d happily post them on &lt;br&gt; &lt;a target=&quot;_blank&quot; rel=nofollow href=&quot;http://streamhacker.com&quot;&gt;[link]&lt;/a&gt; or maybe include them in a second edition of the &lt;br&gt; cookbook (with your permission).
  </summary>
  </entry>
  <entry>
  <author>
  <name>gowtham a</name>
  <email>gowtham1993...@gmail.com</email>
  </author>
  <updated>2013-06-16T12:23:18Z</updated>
  <id>http://groups.google.com/group/nltk-users/browse_thread/thread/812d919d6910b4a1/545292e5a4d9b154?show_docid=545292e5a4d9b154</id>
  <link href="http://groups.google.com/group/nltk-users/browse_thread/thread/812d919d6910b4a1/545292e5a4d9b154?show_docid=545292e5a4d9b154"/>
  <title type="text">problem with LazyCorpusLoader</title>
  <summary type="html" xml:space="preserve">
  i&#39;m using nltk.corpus.util.LazyCorpusLoader() to load wsj treebank mrg &lt;br&gt; files with tag_mapping_function=nltk.tag.simplify argument but even &lt;br&gt; after doing so i&#39;m getting non terminals like &#39;NP-SBJ-1&#39; &lt;br&gt; &lt;p&gt;this is the line of code i&#39;m using to load the treebank &lt;br&gt; treebank=nltk.corpus.util.LazyCorpusLoader(&#39;treebank/
  </summary>
  </entry>
  <entry>
  <author>
  <name>Miles</name>
  <email>wenhaofu0...@gmail.com</email>
  </author>
  <updated>2013-06-15T08:19:09Z</updated>
  <id>http://groups.google.com/group/nltk-users/browse_thread/thread/28e4ca4d7b7913fe/e6d0dd794d8268f7?show_docid=e6d0dd794d8268f7</id>
  <link href="http://groups.google.com/group/nltk-users/browse_thread/thread/28e4ca4d7b7913fe/e6d0dd794d8268f7?show_docid=e6d0dd794d8268f7"/>
  <title type="text">Discussing Exercise 24 and 27 of Ch02 of the Book</title>
  <summary type="html" xml:space="preserve">
  Hello, everyone, I started reading the book, NLP with Python, recently, and &lt;br&gt; have just finished Chap 02 last night. In completing this chapter, I find &lt;br&gt; Exercise 24 very interesting and would love to discuss with you guys. &lt;br&gt; Exercise 24 approaches the random text generation function. A primitive &lt;br&gt; version of this function, which, given the seed word, chooses every next
  </summary>
  </entry>
  <entry>
  <author>
  <name>rubics_cube</name>
  <email>gene.d...@gmail.com</email>
  </author>
  <updated>2013-06-14T01:31:29Z</updated>
  <id>http://groups.google.com/group/nltk-users/browse_thread/thread/85e8f25f816c48b9/f5984463938e4574?show_docid=f5984463938e4574</id>
  <link href="http://groups.google.com/group/nltk-users/browse_thread/thread/85e8f25f816c48b9/f5984463938e4574?show_docid=f5984463938e4574"/>
  <title type="text">phrasal replacement?</title>
  <summary type="html" xml:space="preserve">
  I wanted to use the CsvWordReplacer to create a phrasal replacement &lt;br&gt; file for things such as idioms and phrasal prepositions. &lt;br&gt; &lt;p&gt;Examples: &lt;br&gt; kicked the bucket &amp;gt;&amp;gt; kicked_the_bucket &lt;br&gt; because of &amp;gt;&amp;gt; because_of &lt;br&gt; in fact &amp;gt;&amp;gt; in_fact &lt;br&gt; &lt;p&gt;I&#39;m running into two issues with this. &lt;br&gt; CsvWordReplacer expects single tokens, not multiples and/or sentences.
  </summary>
  </entry>
  <entry>
  <author>
  <name>rubics_cube</name>
  <email>gene.d...@gmail.com</email>
  </author>
  <updated>2013-06-14T01:15:02Z</updated>
  <id>http://groups.google.com/group/nltk-users/browse_thread/thread/b51e9f8ffed4fd5a/ab0fab98a68046a7?show_docid=ab0fab98a68046a7</id>
  <link href="http://groups.google.com/group/nltk-users/browse_thread/thread/b51e9f8ffed4fd5a/ab0fab98a68046a7?show_docid=ab0fab98a68046a7"/>
  <title type="text">parsing issues on -ing and -ed words</title>
  <summary type="html" xml:space="preserve">
  I&#39;m noticing some rather odd parsing results in NLTK that I&#39;d like to &lt;br&gt; find a way to correct. I would welcome any suggestions anyone might &lt;br&gt; have, especially if you&#39;ve encountered this issue and have found a &lt;br&gt; solution. &lt;br&gt; &lt;p&gt;Try the following sentence: &lt;br&gt; &amp;quot;The pyramids, for example, bring the eye to a point, and that point
  </summary>
  </entry>
  <entry>
  <author>
  <name>noorsakinah_UTeM</name>
  <email>sakinahshaee...@gmail.com</email>
  </author>
  <updated>2013-06-09T12:04:13Z</updated>
  <id>http://groups.google.com/group/nltk-users/browse_thread/thread/7ce0f77d683c0e48/a9d3e8eaf51bcbb6?show_docid=a9d3e8eaf51bcbb6</id>
  <link href="http://groups.google.com/group/nltk-users/browse_thread/thread/7ce0f77d683c0e48/a9d3e8eaf51bcbb6?show_docid=a9d3e8eaf51bcbb6"/>
  <title type="text">categorized corpus</title>
  <summary type="html" xml:space="preserve">
  Hi, &lt;br&gt; my situation is just like the movie_reviews corpus. &lt;br&gt; My folder is named commenFb, and inside it there are 2 folders, which I labelled &lt;br&gt; positif and negatif. &lt;br&gt; The problem is that when I call reader.categories(), nothing is displayed. &lt;br&gt; So, how can I make categories() pick up my folders? &lt;br&gt; This is my code.
  </summary>
  </entry>
  <entry>
  <author>
  <name>Alexis Dimitriadis</name>
  <email>alexis.dimitria...@gmail.com</email>
  </author>
  <updated>2013-06-06T19:27:01Z</updated>
  <id>http://groups.google.com/group/nltk-users/browse_thread/thread/89c3806f37cc8c10/5278aa603dbac720?show_docid=5278aa603dbac720</id>
  <link href="http://groups.google.com/group/nltk-users/browse_thread/thread/89c3806f37cc8c10/5278aa603dbac720?show_docid=5278aa603dbac720"/>
  <title type="text">Re: [nltk-users] Re: create my own corpus</title>
  <summary type="html" xml:space="preserve">
  You are welcome. &lt;br&gt; &lt;p&gt;Your corpus is ready to use; it&#39;s just loaded differently from &lt;br&gt; NLTK&#39;s built-in corpora. If you look inside the source for nltk.corpus, &lt;br&gt; you&#39;ll see something like this (simplifying for exposition): &lt;br&gt; &lt;p&gt; brown = CategorizedTaggedCorpusReader( ...) &lt;br&gt; &lt;p&gt;It&#39;s just a trick, in other words. When you import brown, you import a
  </summary>
  </entry>
  <entry>
  <author>
  <name>Nana Dureyna</name>
  <email>nanaoz_na...@hotmail.com</email>
  </author>
  <updated>2013-06-06T06:28:05Z</updated>
  <id>http://groups.google.com/group/nltk-users/browse_thread/thread/89c3806f37cc8c10/6de46eeab1d59806?show_docid=6de46eeab1d59806</id>
  <link href="http://groups.google.com/group/nltk-users/browse_thread/thread/89c3806f37cc8c10/6de46eeab1d59806?show_docid=6de46eeab1d59806"/>
  <title type="text">Re: create my own corpus</title>
  <summary type="html" xml:space="preserve">
  The PlaintextCorpusReader works. &lt;br&gt; from nltk.corpus.reader import PlaintextCorpusReader &lt;br&gt; os.path.join(&#39;C:\\Users\\Win 7\\nltk_data&#39;, &amp;quot;corpora/muet&amp;quot;) &lt;br&gt; corpus_root = &#39;C:\\Users\\Win 7\\nltk_data\\corpora/muet&#39; &lt;br&gt; wordlists = PlaintextCorpusReader(corpus_root,&#39;.*&#39;) &lt;br&gt; wordlists.fileids() &lt;br&gt; When I want to import muet (e.g. from nltk.corpus import muet) to
  </summary>
  </entry>
  <entry>
  <author>
  <name>Alexis Dimitriadis</name>
  <email>alexis.dimitria...@gmail.com</email>
  </author>
  <updated>2013-06-05T12:04:49Z</updated>
  <id>http://groups.google.com/group/nltk-users/browse_thread/thread/89c3806f37cc8c10/2e5c729e3a1223a3?show_docid=2e5c729e3a1223a3</id>
  <link href="http://groups.google.com/group/nltk-users/browse_thread/thread/89c3806f37cc8c10/2e5c729e3a1223a3?show_docid=2e5c729e3a1223a3"/>
  <title type="text">Re: [nltk-users] create my own corpus</title>
  <summary type="html" xml:space="preserve">
  Nana, &lt;br&gt; &lt;p&gt;Take a look at the documentation for the PlaintextCorpusReader class. &lt;br&gt; You call it with the (relative) path to your corpus, e.g., &lt;br&gt; os.path.join(nltk.data.root, &amp;quot;corpora/muet&amp;quot;), and a pattern matching the &lt;br&gt; filenames, and you get a corpus object with methods fileids(), words(), &lt;br&gt; sents(), etc.
  </summary>
  </entry>
  <entry>
  <author>
  <name>Nana Dureyna</name>
  <email>nanaoz_na...@hotmail.com</email>
  </author>
  <updated>2013-06-05T06:46:36Z</updated>
  <id>http://groups.google.com/group/nltk-users/browse_thread/thread/89c3806f37cc8c10/1a22a0b98ab093d3?show_docid=1a22a0b98ab093d3</id>
  <link href="http://groups.google.com/group/nltk-users/browse_thread/thread/89c3806f37cc8c10/1a22a0b98ab093d3?show_docid=1a22a0b98ab093d3"/>
  <title type="text">create my own corpus</title>
  <summary type="html" xml:space="preserve">
  I want to create my own corpus. &lt;br&gt; I have my own folder of documents, which I have named &amp;quot;muet&amp;quot;. &lt;br&gt; I have already copied this folder into the path where the corpora are placed. &lt;br&gt; This folder contains more than one text file. &lt;br&gt; So, how do I load this folder into NLTK, so that NLTK can access &lt;br&gt; my files?
  </summary>
  </entry>
  <entry>
  <author>
  <name>Samith Dassanayake</name>
  <email>hisam...@gmail.com</email>
  </author>
  <updated>2013-06-01T07:12:30Z</updated>
  <id>http://groups.google.com/group/nltk-users/browse_thread/thread/04a8a9908ade6305/1f5da846426fee21?show_docid=1f5da846426fee21</id>
  <link href="http://groups.google.com/group/nltk-users/browse_thread/thread/04a8a9908ade6305/1f5da846426fee21?show_docid=1f5da846426fee21"/>
  <title type="text">Named Entity Recognition with lower case words?</title>
  <summary type="html" xml:space="preserve">
  Hi all, &lt;br&gt; I am developing software that analyzes social media and comes up with new &lt;br&gt; trends, and I am using NLTK to achieve my goals. &lt;br&gt; My question is: in NLTK, when we use the NER feature, if we input &lt;br&gt; the sentence &amp;quot;*I live in New York*.&amp;quot; , it will identify &amp;quot;New York&amp;quot; as a * &lt;br&gt; Location*. &lt;br&gt; But if the input is &amp;quot;*I live in new york*.&amp;quot; it won&#39;t recognize &amp;quot;new york&amp;quot; as
  </summary>
  </entry>
  <entry>
  <author>
  <name>Subhabrata</name>
  <email>subhabangal...@gmail.com</email>
  </author>
  <updated>2013-05-30T19:34:02Z</updated>
  <id>http://groups.google.com/group/nltk-users/browse_thread/thread/310d69ae8c1f200a/701ee924764e6521?show_docid=701ee924764e6521</id>
  <link href="http://groups.google.com/group/nltk-users/browse_thread/thread/310d69ae8c1f200a/701ee924764e6521?show_docid=701ee924764e6521"/>
  <title type="text">Question on Tagging</title>
  <summary type="html" xml:space="preserve">
  Dear Group, &lt;br&gt; From web surfing I came to know that the simple &amp;quot;nltk.pos_tag(text)&amp;quot; is based on &lt;br&gt; a Maximum Entropy model. &lt;br&gt; Is this information correct? &lt;br&gt; Regards, &lt;br&gt; Subhabrata.
  </summary>
  </entry>
</feed>
