Re: [nltk-users] Re: Google App Engine


Adam Oakman

Jul 14, 2009, 9:18:27 PM
to no...@comp.leeds.ac.uk, er...@comp.leeds.ac.uk, nltk-...@googlegroups.com
Hi Nora,
Thanks for that. I agree that decoupling seems to be the way this will
need to happen. I don't need everything, but I am using the Porter stemmer,
Punkt tokenizing and Brill tagging, so it's enough to be slightly painful. I
was initially just seeing what would be involved in getting it running, and
then planned to concentrate on only what is needed while preserving some kind
of future upgradability, but I guess I need to dig in and start decoupling
what I need.

Thanks for your input

Adam


On 7/14/09 4:19 AM, "Noorhan Abbas" <sc...@leeds.ac.uk> wrote:

> Hello,
> I have deployed some nltk files on Google App Engine server! Well, I only
> needed the Lancaster Stemmer. I wrote to Dr. Edward Loper about the problem of
> dependencies and he advised me on how to detach the stemmer from the rest of
> the nltk files.
>
> If you really want to deploy all the nltk files and the data files on the GAE
> server, bear in mind the total number of files you are permitted to upload to
> the server. The total should not exceed 1000 files! Another issue is the size
> of the files: the GAE server will not allow you to upload very large files. I
> am afraid this is not documented in the library.
>
> Do you really want to deploy all the nltk files or just some modules?
>
> Best of luck,
> Nora
>
> ________________________________________
> From: Eric Atwell [gorgeouss...@googlemail.com]
> Sent: 13 July 2009 23:19
> To: no...@comp.leeds.ac.uk; oakmad; er...@comp.leeds.ac.uk
> Subject: Fwd: [nltk-users] Re: Google App Engine
>
> Nora,
>
> can you reply with any useful advice? You seem to be the only person
> to successfully include NLTK in a google appengine website!
>
> thanks
>
> eric
>
>
> ---------- Forwarded message ----------
> From: oakmad <adam....@gmail.com>
> Date: 2009/7/4
> Subject: [nltk-users] Re: Google App Engine
> To: nltk-users <nltk-...@googlegroups.com>
>
>
>
> Yes, I deployed it per the Google App Engine instructions; you cannot
> deploy it into, say, site-packages as the installer does. Essentially
> this involves creating an NLTK directory in the root of the application
> structure.
>
> On Jul 4, 1:48 pm, Shaul Kedem <shaul.ke...@gmail.com> wrote:
>> did you deploy it as it should be deployed? because these dependencies
>> are part of the regular nltk install
>>
>> On Sat, Jul 4, 2009 at 1:37 PM, oakmad<adam.oak...@gmail.com> wrote:
>>
>>> Hi,
>>> Has anyone successfully deployed NLTK (2.0b3) to Google App
>>> Engine? I am investigating it currently and bumping into many
>>> dependency errors, things like NLTK searching for the nltk_data
>>> directory, sqlite and numpy so far. Just wondering if anyone has any
>>> tips before I dig right in to work it out.
>>
>>> Thanks
>>
>>> Adam
>

Pedro Marcal

Jul 15, 2009, 4:40:52 AM
to nltk-...@googlegroups.com, Adam Oakman
Hi Adam, Nora,
This was a project that I was going to get around to, i.e. trying Google App Engine. It seems to me that the better solution to this problem is to contact the Google App Engine people and ask them to install nltk in their user library. We certainly have the user base to support a request "that will serve a dedicated user community of Computational Linguists". Perhaps someone with more official clout for NLTK can contact Google on our behalf.
Regards,
Pedro

Justin Olmanson

Jul 15, 2009, 11:08:58 AM
to nltk-...@googlegroups.com
Great idea!

I'm working on a project that uses NLTK's WordNet and Grammar Tagging tools.

I'd love to be able to use GAE instead of my shared server space.  :-)

best,

Justin

oakmad

Aug 2, 2009, 8:38:13 PM
to nltk-users
For those who are interested, I managed to work through
deploying several NLTK modules to GAE. It was actually a lot easier
than I expected. The main issue lies in the NLTK
__init__.py files, which load lots of things (from what I gather) that
I wasn't going to be using. To start, create an nltk directory in your
GAE environment and populate it with an empty __init__.py. At this
point it's a case of copying whichever modules you require into this
directory; if you need a subdirectory, create it and make sure you
place an empty __init__.py file in there (my Python newbie-ness caught
me out with this). I then created a test page that just imported the
modules I planned on using. Loading it would trigger import errors,
and I followed the trail, adding in each required module. The only
time this didn't work was with accuracy in metrics, which is in the
scores file; I replicated the standard init file and just included
scores.py (minor). So far I have stem.porter, tag.brill,
tag.sequential and tokenize.punkt working for my needs. It's
pretty quick. I've taken the approach of using pickles where possible,
as with my trained Brill data. So far I'm happy with the result.
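
In case it helps anyone follow along, here is a rough sketch of the two steps
above: creating the package directories with empty __init__.py files, and
probing which imports still fail. It is not the exact code I ran. The helper
names make_package and probe_imports are made up for illustration, and it is
written in present-day Python 3 rather than whatever runtime your GAE
environment provides. It only catches ImportError, so other start-up errors
(a missing nltk_data directory, for instance) would still surface as
exceptions.

```python
import importlib
import os


def make_package(root, subpackages=()):
    """Create root/ and each root/<sub>/ with an empty __init__.py.

    Hypothetical helper: it only builds the skeleton; copying the
    actual NLTK module files into it is still a manual step.
    """
    for rel in ("",) + tuple(subpackages):
        pkg_dir = os.path.join(root, rel)
        os.makedirs(pkg_dir, exist_ok=True)
        init_path = os.path.join(pkg_dir, "__init__.py")
        if not os.path.exists(init_path):
            # An empty file is enough to mark the directory as a package.
            open(init_path, "w").close()


def probe_imports(module_names):
    """Try each import and return {module name: error text} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            # The message usually names the module that still needs copying.
            failures[name] = str(exc)
    return failures
```

Calling make_package("nltk", subpackages=("stem", "tag", "tokenize")) and then
probe_imports(["nltk.stem.porter", "nltk.tag.brill", "nltk.tag.sequential",
"nltk.tokenize.punkt"]) after each round of copying gives you the list of
what is still missing, which is the same loop I went through by reloading
the test page.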

Hope that helps...

Adam

On Jul 15, 10:08 am, Justin Olmanson <olman...@gmail.com> wrote:
> Great idea!
>
> I'm working on a project that uses NLTK's WordNet and Grammar Tagging tools
>
> I'd love to be able to use GAE instead of my shared server space.  :-)
>
> best,
>
> Justin
>
>
>
> On Wed, Jul 15, 2009 at 3:40 AM, Pedro Marcal <marca...@cox.net> wrote:
>
> > Hi Adam, Nora,
> > This was a project that I was going to get around to, i.e. trying Google
> > App Engine. It seems to me that the better solution to this problem is to
> > contact the Google App Engine people and ask them to install nltk in their
> > user library. We certainly have the user base to support a request "that
> > will serve a dedicated user community of Computational Linguists". Perhaps
> > someone with more official clout for NLTK can contact Google on our behalf.
> > Regards,
> > Pedro