Running pattern in Google App Engine
From: Romain <romain.e...@gmail.com>
Date: Sun, 10 Apr 2011 16:15:37 -0700 (PDT)
Subject: Running pattern in Google App Engine
Hi all,

I am trying to run a Pattern-based app on GAE and I am struggling with
the quota of 10MB max per file.
I have zipped the whole Pattern package, but the zip is 12MB.
I tried to zipsplit it into 2 zips of at most 10MB each and then ran
this program:

import sys
sys.path.insert(0, 'pattern.zip')
import pattern
pattern.__path__.append('pattern2.zip/pattern')

from google.appengine.ext import webapp
from google.appengine.ext.webapp import util
from pattern.vector import Document
from pattern.search import Pattern, Constraint, Classifier, taxonomy, search
from pattern.en     import Sentence, parse

etc...

then I get the following error:

Traceback (most recent call last):
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/dev_appserver.py", line 3858, in _HandleRequest
    self._Dispatch(dispatcher, self.rfile, outfile, env_dict)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/dev_appserver.py", line 3792, in _Dispatch
    base_env_dict=env_dict)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/dev_appserver.py", line 580, in Dispatch
    base_env_dict=base_env_dict)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/dev_appserver.py", line 2918, in Dispatch
    self._module_dict)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/dev_appserver.py", line 2822, in ExecuteCGI
    reset_modules = exec_script(handler_path, cgi_path, hook)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/dev_appserver.py", line 2702, in ExecuteOrImportScript
    exec module_code in script_module.__dict__
  File "/Users/romain/Documents/sites/testrvestr/main.py", line 28, in <module>
    from pattern.en     import Sentence, parse
ImportError: No module named en

It looks like the loading of the libs is a bit broken.

Any ideas, or a better approach to using Pattern on GAE?

Thanks in advance (and apologies, because I am new to Python).

Romain.


 
From: Ross M Karchner <rosskarch...@gmail.com>
Date: Mon, 11 Apr 2011 08:56:24 -0400
Subject: Re: Running pattern in Google App Engine
Do you have to zip it?

--
Ross M Karchner

 
From: Romain <romain.e...@gmail.com>
Date: Mon, 11 Apr 2011 06:16:53 -0700 (PDT)
Subject: Re: Running pattern in Google App Engine
Maybe I was not clear (or I don't understand your question), so I will
clarify:

The zipped Pattern lib is larger than 10MB.

So I am splitting it into 2 zips, each less than 10MB, but then the
loading of the lib breaks.

From: Tom De Smedt <tomdesm...@gmail.com>
Date: Mon, 11 Apr 2011 15:32:32 +0200
Subject: Re: Running pattern in Google App Engine
I have no experience with Google App Engine yet, but if the issue is  
about file size: the biggest files are in pattern/en/wordnet/. Leaving  
out the wordnet folder should keep file size below 10MB. You then have  
a Pattern module without WordNet, but all other functionality should  
work as documented.

From: Romain <romain.e...@gmail.com>
Date: Mon, 11 Apr 2011 07:34:52 -0700 (PDT)
Subject: Re: Running pattern in Google App Engine
Yep, you are correct: the problem is the big data.noun file.
Of course, one option is to not use WordNet at all, but ... WordNet is
what I need :-o

Any thoughts on whether data.noun could be split in 2 and then
loaded in 2 steps at run time?

I don't know the WordNet implementation or how it gets loaded.

From: Tom De Smedt <tomdesm...@gmail.com>
Date: Mon, 11 Apr 2011 19:18:06 +0200
Subject: Re: Running pattern in Google App Engine
Rather than loading the files into memory, pywordnet (by Oliver  
Steele) will use a binary search on the index files, and then directly  
retrieve the offset it needs from the corresponding data file. This  
happens in the _lineAt() function in wordnet/pywordnet/wordnet.py. The  
solution would be to split data.noun into two files of 7MB. If it  
reads from the first file and EOF is encountered, it should instead  
read from the second file, something along the lines of:

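import os

p1 = "dict/data.noun1"
p2 = "dict/data.noun2"
f1 = open(p1, "rb")
f2 = open(p2, "rb")

# Example byte offset into the original, unsplit data.noun.
offset = 10000000

# Try the first file; seeking past its end makes readline() return "".
f1.seek(offset)
line = f1.readline()
if len(line) == 0:
    # Past the end of part 1: shift the offset by the size of part 1.
    f2.seek(offset - os.path.getsize(p1))
    line = f2.readline()

print(line)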

So is this a hack or a new feature? I can implement it in the next  
revision, but then I'll need some more time to do it carefully so  
there is no performance drop.
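
To make the idea above more concrete, here is a minimal sketch of splitting and reading back. It is not the actual pywordnet code: the helper names split_at_line_boundary() and line_at() are made up for illustration. It assumes the data file is cut only at a line boundary, so that no line straddles two parts, and that offsets always refer to positions in the original, unsplit file.

```python
import os


def _write_part(path, number, lines):
    """Write lines to path + number (e.g. data.noun1) and return that path."""
    part = "%s%d" % (path, number)
    with open(part, "wb") as f:
        f.writelines(lines)
    return part


def split_at_line_boundary(path, parts=2):
    """Split a file into roughly equal parts, cutting only after a newline.

    Hypothetical helper for illustration; returns the list of part paths.
    """
    with open(path, "rb") as f:
        lines = f.readlines()
    total = sum(len(line) for line in lines)
    target = total / float(parts)  # approximate size of each part
    paths, chunk, size = [], [], 0
    for line in lines:
        chunk.append(line)
        size += len(line)
        # Close the current part once it holds its share of the bytes,
        # but keep the last part open for all remaining lines.
        if size >= target and len(paths) < parts - 1:
            paths.append(_write_part(path, len(paths) + 1, chunk))
            chunk, size = [], 0
    paths.append(_write_part(path, len(paths) + 1, chunk))
    return paths


def line_at(offset, paths):
    """Return the line starting at byte offset of the original file."""
    for part in paths:
        size = os.path.getsize(part)
        if offset < size:
            with open(part, "rb") as f:
                f.seek(offset)
                return f.readline()
        offset -= size  # skip this part and look in the next one
    return b""  # offset is past the end of all parts
```

This version reopens a part on every lookup, which keeps the sketch short but is slow. A careful implementation would probably keep the file handles open and cache the part sizes.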

From: Ross M Karchner <rosskarch...@gmail.com>
Date: Mon, 11 Apr 2011 17:47:06 -0400
Subject: Re: Running pattern in Google App Engine
I was trying to feel out why you need to zip the library at all--
zipped libraries are a nice option if you're running into GAE's
file-count limit, but it's generally not a requirement.

I also *think* (not 100% sure) zipped libraries can't take advantage
of GAE's python pre-compilation. At least, there's a ticket asking for
that:

http://code.google.com/p/googleappengine/issues/detail?id=4634

--
Ross M Karchner

 
From: Tom De Smedt <tomdesm...@gmail.com>
Date: Wed, 13 Apr 2011 13:54:54 +0200
Subject: Re: Running pattern in Google App Engine
I've made some changes to pywordnet to support partitioned data files.  
Pattern now uses a data.noun1 + data.noun2 instead of a single  
data.noun. Both files are below 10MB so this should enable you to  
upload Pattern to GAE. I've also upgraded to WordNet 3.0. You can grab
the latest source code from
http://code.google.com/p/pattern-for-python
or wait for the official new release (this should be available
in the coming days).
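
Before uploading, it may help to check that no file in the package still exceeds the per-file limit. Below is a minimal sketch. It assumes the 10MB limit mentioned at the start of this thread means 10 * 1024 * 1024 bytes (the exact figure GAE enforces may differ), and oversized_files() is a made-up helper, not part of Pattern or the GAE SDK.

```python
import os

LIMIT = 10 * 1024 * 1024  # assumed per-file limit in bytes


def oversized_files(root, limit=LIMIT):
    """Return (path, size) pairs for files under root larger than limit."""
    found = []
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            size = os.path.getsize(path)
            if size > limit:
                found.append((path, size))
    # Biggest offenders first.
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Calling oversized_files("pattern") from the application directory should return an empty list once every file is small enough to upload.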
