Store class at process start, and use it to initiate inside function?


Lukasz Szybalski

Oct 6, 2018, 6:00:34 PM
to pylons-...@googlegroups.com
Hello,
I have an enterprise system that creates a class instance, but it takes a long time to initialize: about 2 sec, with 90K _setitem calls from pickle. There is nothing left to profile; since the OS caches the file, it's as fast as it gets.

I'm trying to find a way in pyramid where I can:

#store below at start, of the process, let it initiate,
#then somehow make it read only,
#so that a process can use it later and modify its own copy

import copy
from enterprise import Contract
my = Contract()


#rest of the program
my2 = copy.copy(my)  # or copy-on-write, similar to how the qcow format works
my2.find_contract('abc')
my2.add_name('Lucas')
return my2.stuff

I can't seem to find the right terminology or technique for this. How do I do it, and where should it go in Pyramid?

Thanks
Lucas

Michael Merickel

Oct 7, 2018, 1:59:58 AM
to Pylons
This sounds like an application-global object. These are typically stored on the registry at config time. For example, in your main you could set config.registry.foo = contract. The registry is available as request.registry, and so is anything you add to it. You can see lots of examples of this in Pyramid add-ons and in things like the dbsession_factory in the alchemy cookiecutter.
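
A rough sketch of the config-time approach, not taken from Pyramid itself. `Contract` is a hypothetical stand-in for the slow `enterprise.Contract`, and `SimpleNamespace` objects stand in for the real `Configurator` and `request` so the snippet runs without Pyramid installed. In a real app, `includeme(config)` would receive the Configurator and the view would receive a real request.

```python
import copy
from types import SimpleNamespace


class Contract:
    """Hypothetical stand-in for the slow-to-build enterprise.Contract."""

    def __init__(self):
        self.names = []

    def add_name(self, name):
        self.names.append(name)


def includeme(config):
    # Runs once per process at config time, before any request is served.
    config.registry.contract = Contract()


def my_view(request):
    # Work on a private copy so the shared instance is never modified.
    contract = copy.deepcopy(request.registry.contract)
    contract.add_name("Lucas")
    return contract.names


# Stand-ins for Pyramid's Configurator and request (assumptions for the demo).
registry = SimpleNamespace()
includeme(SimpleNamespace(registry=registry))
request = SimpleNamespace(registry=registry)
result = my_view(request)
```

`copy.deepcopy` is used here because a shallow `copy.copy` would still share nested mutable attributes (such as the `names` list) with the original. Whether a deep copy is fast enough for the real `Contract` is something to measure.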

--
You received this message because you are subscribed to the Google Groups "pylons-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pylons-discus...@googlegroups.com.
To post to this group, send email to pylons-...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pylons-discuss/CAKkTUv3qgT%2BUk0-uvLB1owZEA3W%3D-7XA-wkiyZbteWHPAcO6vg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Lukasz Szybalski

Oct 8, 2018, 12:20:13 PM
to pylons-discuss


On Sunday, October 7, 2018 at 12:59:58 AM UTC-5, Michael Merickel wrote:
This sounds like an application-global object. These are typically stored on the registry at config time. For example, in your main you could set config.registry.foo = contract. The registry is available as request.registry, and so is anything you add to it. You can see lots of examples of this in Pyramid add-ons and in things like the dbsession_factory in the alchemy cookiecutter.

Thank you.

I decided to create it on the first run of the function:
try:
    if not request.registry.mycontract:
        request.registry.mycontract = Contract()
    # copy on every request, not only on the first run
    mycontract = copy.copy(request.registry.mycontract)
...
#rest of the code:
mycontract.add_user()
mycontract.update_terms()

Sidenote:
Is there a way to force this object to be "read only", or to not allow modification, so that somebody else in some other section of the code cannot accidentally modify request.registry.mycontract?

**Update: the function runs in 0.42755 sec now.

Thanks
Lucas

Michael Merickel

Oct 8, 2018, 1:10:20 PM
to Pylons
If you load the data on the "first run of the function", then you have introduced a race condition in your app: unless you do appropriate locking, two threads (most WSGI servers serve one request per thread) may both consider themselves the first run and load the data. The only way to do this without locks is to do things at config time, as I suggested before. Because Python has a GIL, there are hacks that let you do the locking in more lightweight ways in the "first run of the function" case, but I do not recommend relying on that behavior.
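
For the request-time variant, the "appropriate locking" could look something like the sketch below. It is only an illustration: `build_contract` is a hypothetical stand-in for the slow constructor, and the module-level globals are an assumption made for the demo. Doing the work at config time avoids the lock entirely.

```python
import threading

_lock = threading.Lock()
_contract = None
build_count = 0


def build_contract():
    """Hypothetical stand-in for the slow Contract() constructor."""
    global build_count
    build_count += 1
    return {"terms": []}


def get_contract():
    """Build the shared object at most once, even under concurrent calls."""
    global _contract
    with _lock:  # only one thread at a time may check and build
        if _contract is None:
            _contract = build_contract()
        return _contract


# Eight threads race to be the "first run"; only one build happens.
threads = [threading.Thread(target=get_contract) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```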

As far as defining an object as read-only, there is nothing specific to Pyramid here, and you'll have to find a satisfactory solution in the rest of the Python world.


Mike Orr

Oct 8, 2018, 1:34:06 PM
to pylons-...@googlegroups.com
On Mon, Oct 8, 2018 at 9:20 AM Lukasz Szybalski <szyb...@gmail.com> wrote:
> Is there a way to force this object to be "read only", or to not allow modification, so that somebody else in some other section of the code cannot accidentally modify request.registry.mycontract?

def __setattr__(self, attr, value):
    if self.__locked:
        raise AttributeError("read-only (locked) object")
    # not locked: fall through to the normal assignment
    super().__setattr__(attr, value)
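
Fleshed out into something runnable, the idea might look like the sketch below. The `Lockable` base class, its `lock()` method, and the `Contract` stand-in are hypothetical names for illustration. A single-underscore `_locked` flag with a class-level default is used instead of `__locked`, an assumption made here to sidestep name mangling and to let `__init__` assign attributes before the object is locked.

```python
class Lockable:
    """Refuses attribute assignment once lock() has been called."""

    _locked = False  # class-level default, so the check works during __init__

    def lock(self):
        # Bypass our own __setattr__ to flip the flag.
        object.__setattr__(self, "_locked", True)

    def __setattr__(self, attr, value):
        if self._locked:
            raise AttributeError("read-only (locked) object")
        super().__setattr__(attr, value)


class Contract(Lockable):
    """Hypothetical stand-in for the real Contract class."""

    def __init__(self):
        self.terms = "net 30"


contract = Contract()
contract.lock()
try:
    contract.terms = "net 60"
    blocked = False
except AttributeError:
    blocked = True
```

This only guards against rebinding attributes. A list or dict stored on the object can still be mutated in place, and attribute deletion is not blocked.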

Thierry Florac

Oct 9, 2018, 2:01:00 AM
to pylons-...@googlegroups.com
And how do you handle such a use case when working in a multi-process/multi-host cluster configuration?



Tres Seaver

Oct 9, 2018, 12:00:31 PM
to pylons-...@googlegroups.com
On 10/09/2018 02:00 AM, Thierry Florac wrote:

> And how do you handle such a use case when working in a multi-process /
> multi-host cluster configuration?

Fork first, then run the configure step. Forking before creating stateful
globals is considered best practice for multi-processing.


Tres.
--
===================================================================
Tres Seaver +1 540-429-0999 tse...@palladion.com
Palladion Software "Excellence by Design" http://palladion.com

Bert JW Regeer

Oct 9, 2018, 12:05:51 PM
to Pylons Project
You run the same code at configure time and load it once per process. Whether you do this at request time or at configure time, you have to do it once per process.

Bert JW Regeer

Oct 9, 2018, 12:08:35 PM
to Pylons Project
I would disagree, heavily. You want to create your globals once, then fork. This way the memory used by said global can be shared between all of the processes. Instagram even added the ability to freeze items so that they don't go through the normal GC cycle and thus don't accidentally cause COW on those objects:

https://instagram-engineering.com/copy-on-write-friendly-python-garbage-collection-ad6ed5233ddf
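
A small sketch of the "create globals, then fork" order, assuming a Unix system (for `os.fork`) and Python 3.7 or later (for `gc.freeze`, the feature the article led to). `load_contract` and `preload_then_fork` are hypothetical names; in a real deployment the WSGI server does the forking rather than application code.

```python
import gc
import os


def load_contract():
    """Hypothetical stand-in for the slow Contract() constructor."""
    return {"terms": ["a", "b"]}


def preload_then_fork(n_workers=2):
    gc.disable()                # no collections while the global is built
    contract = load_contract()  # created once, in the parent
    gc.freeze()                 # tracked objects move to a permanent generation
    pids = []
    for _ in range(n_workers):
        pid = os.fork()
        if pid == 0:            # child: reads the object built by the parent
            gc.enable()
            ok = contract["terms"] == ["a", "b"]
            os._exit(0 if ok else 1)
        pids.append(pid)
    gc.enable()
    # waitpid returns (pid, status); a status of 0 means a clean exit.
    return [os.waitpid(pid, 0)[1] for pid in pids]


statuses = preload_then_fork()
```

Even with frozen objects, CPython still updates reference counts when a child touches them, so some pages get copied anyway; the sharing is best-effort.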

Jonathan Vanasco

Oct 9, 2018, 12:19:24 PM
to pylons-discuss


On Tuesday, October 9, 2018 at 12:08:35 PM UTC-4, Bert JW Regeer wrote:
I would disagree, heavily. You want to create your globals once, then fork. This way the memory used by said global can be shared between all of the processes.

+1

bhe...@uniqueinsuranceco.com

Oct 17, 2018, 2:27:19 PM
to pylons-discuss
Thanks, guys, for all your help. Using the method Michael described, our requests went from seconds to fractions of a second.

Lukasz Szybalski

Nov 29, 2018, 12:11:55 AM
to pylons-discuss


On Monday, October 8, 2018 at 12:10:20 PM UTC-5, Michael Merickel wrote:
If you load the data on the "first run of the function", then you have introduced a race condition in your app: unless you do appropriate locking, two threads (most WSGI servers serve one request per thread) may both consider themselves the first run and load the data. The only way to do this without locks is to do things at config time, as I suggested before.


Hello,
How would one handle the following?
We now initialize the class and establish the connection at config time in
__init__.py

config.registry.MY = MYContract()

This definitely works, but now we are running into "connection reset by peer" errors. I guess in the other scenario we connected on every request, so each time we established a new connection. Now we are reusing the connection, which causes the traceback below.
How can I try/except this at the __init__.py level in the config, or what do I do in views.py to catch the error and redo "config.registry.MY = MYContract()" to fix the connection issue?



[Thu Nov 01 09:50:28.215262 2018] [wsgi:error] [pid 26861:tid 140425465390848] [remote a]
[Thu Nov 01 09:50:28.215271 2018] [wsgi:error] [pid 26861:tid 140425465390848] [remote a] Traceback (most recent call last):
[Thu Nov 01 09:50:28.215277 2018] [wsgi:error] [pid 26861:tid 140425465390848] [remote a]   File "zzzzzzzzzzz/raven/utils/serializer/manager.py", line 76, in transform
[Thu Nov 01 09:50:28.215283 2018] [wsgi:error] [pid 26861:tid 140425465390848] [remote a]     return repr(value)
[Thu Nov 01 09:50:28.215295 2018] [wsgi:error] [pid 26861:tid 140425465390848] [remote a] TypeError: __repr__ returned non-string (type bytes)
[Thu Nov 01 09:50:28.215326 2018] [wsgi:error] [pid 26861:tid 140425465390848] [remote a]
[Thu Nov 01 09:50:28.244058 2018] [wsgi:error] [pid 26861:tid 140425465390848] [remote a] mod_wsgi (pid=26861): Exception occurred processing WSGI script '/zzzzzzzzzzzzz.wsgi'.
..........l, headers=headers)
[Thu Nov 01 09:50:28.247647 2018] [wsgi:error] [pid 26861:tid 140425465390848] [remote a]   File "zzzzzzzzzzzz/python3.5/site-packages/httplib2/__init__.py", line 1322, in request
[Thu Nov 01 09:50:28.247653 2018] [wsgi:error] [pid 26861:tid 140425465390848] [remote a]     (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
[Thu Nov 01 09:50:28.247663 2018] [wsgi:error] [pid 26861:tid 140425465390848] [remote a]   File "/zzzzzzzzz/python3.5/site-packages/httplib2/__init__.py", line 1072, in _request
[Thu Nov 01 09:50:28.247669 2018] [wsgi:error] [pid 26861:tid 140425465390848] [remote a]     (response, content) = self._conn_request(conn, request_uri, method, body, headers)

[Thu Nov 01 09:50:28.247752 2018] [wsgi:error] [pid 26861:tid 140425465390848] [remote a]     self.send(msg)
[Thu Nov 01 09:50:28.247761 2018] [wsgi:error] [pid 26861:tid 140425465390848] [remote a]   File "/usr/lib/python3.5/http/client.py", line 908, in send
[Thu Nov 01 09:50:28.247768 2018] [wsgi:error] [pid 26861:tid 140425465390848] [remote a]     self.sock.sendall(data)
[Thu Nov 01 09:50:28.247792 2018] [wsgi:error] [pid 26861:tid 140425465390848] [remote a] ConnectionResetError: [Errno 104] Connection reset by peer

Thank you
Lucas





Lukasz Szybalski

Dec 10, 2018, 5:38:04 PM
to pylons-discuss


On Wednesday, November 28, 2018 at 11:11:55 PM UTC-6, Lukasz Szybalski wrote:


On Monday, October 8, 2018 at 12:10:20 PM UTC-5, Michael Merickel wrote:
If you load the data on the "first run of the function", then you have introduced a race condition in your app: unless you do appropriate locking, two threads (most WSGI servers serve one request per thread) may both consider themselves the first run and load the data. The only way to do this without locks is to do things at config time, as I suggested before.


Hello,
How would one handle the following?
We now initialize the class and establish the connection at config time in
__init__.py

Any ideas on how to catch the exception raised at
  raven/utils/serializer/manager.py", line 76, in transform return repr(value)

and properly reset the connection in the line below?

 config.registry.MY = MYContract()


 Thanks
Lucas

Michael Merickel

Dec 11, 2018, 11:01:18 AM
to Pylons
I think you need to find a connection pool implementation. I'm sure there are some libraries for this, but I haven't looked. At the very least, you do not want to share a connection across two threads at the same time. SQLAlchemy's awesome pool implementations will ensure the connection is still alive (not reset like you're seeing) before returning it to you, closing connections and creating new ones when needed. This is pretty critical to managing connection resources successfully, imo.
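
Short of a full pool, the two points above (one connection per thread, and replacing a connection that has been reset) can be sketched as follows. `PerThreadClient` and `FlakyClient` are hypothetical names, and `FlakyClient` merely simulates a first connection that the peer has reset; a real pool library would do considerably more.

```python
import threading


class PerThreadClient:
    """Give each thread its own client and rebuild it after a reset."""

    def __init__(self, factory):
        self._factory = factory          # callable that opens a new connection
        self._local = threading.local()  # one slot per thread

    def _get(self):
        client = getattr(self._local, "client", None)
        if client is None:
            client = self._local.client = self._factory()
        return client

    def call(self, method, *args, **kwargs):
        try:
            return getattr(self._get(), method)(*args, **kwargs)
        except ConnectionError:
            # Stale connection: drop it, reconnect, and retry once.
            self._local.client = None
            return getattr(self._get(), method)(*args, **kwargs)


class FlakyClient:
    """Hypothetical client whose first connection has gone stale."""

    instances = 0

    def __init__(self):
        FlakyClient.instances += 1
        self.fresh = FlakyClient.instances > 1

    def find_contract(self, name):
        if not self.fresh:
            raise ConnectionResetError(104, "Connection reset by peer")
        return f"contract:{name}"


clients = PerThreadClient(FlakyClient)
result = clients.call("find_contract", "abc")
```

Something like `config.registry.MY = PerThreadClient(MYContract)` would then replace the single shared instance, assuming `MYContract()` opens the connection in its constructor. The automatic retry is only safe for calls that can be repeated without side effects.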

- Michael
