There are just so many IPC modules out there. I'm looking for a solution for developing a new multi-tier application. The core application will be running on a single computer, so the IPC should use shared memory (or mmap) and have very short response times. But there will be a tier that will hold application state for clients, and there will be lots of clients, so that tier needs to go to different computers. I.e. the same IPC should also be accessible over TCP/IP. Most messages will be simple data structures, nothing complicated. The ability to run on PyPy, and also to run on both Windows and Linux, would be a plus.
I have seen a stand alone cross platform IPC server before that could serve "channels", and send/receive messages using these channels. But I don't remember its name and now I cannot find it. Can somebody please help?
On Friday, August 31, 2012 9:22:00 PM UTC+2, Laszlo Nagy wrote:
> There are just so many IPC modules out there. I'm looking for a solution
> for developing a new a multi-tier application. The core application will
> be running on a single computer, so the IPC should be using shared
> memory (or mmap) and have very short response times. But there will be a
> tier that will hold application state for clients, and there will be
> lots of clients. So that tier needs to go to different computers. E.g.
> the same IPC should also be accessed over TCP/IP. Most messages will be
> simple data structures, nothing complicated. The ability to run on PyPy
> would, and also to run on both Windows and Linux would be a plus.
> I have seen a stand alone cross platform IPC server before that could
> serve "channels", and send/receive messages using these channels. But I
> don't remember its name and now I cannot find it. Can somebody please help?
> Thanks,
> Laszlo
Hi,
Are you aware of, and have you considered, zeromq (www.zeromq.org)? It does not prescribe a message format, but you could send things like simple strings (JSON) or more complicated things like Protobuf.
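To illustrate the "simple strings (JSON)" idea: the transport only has to carry bytes, so simple data structures can be encoded before sending and decoded on receipt. The following is a minimal sketch using only the standard library; the function names are hypothetical and nothing here is zeromq-specific.

```python
import json

def encode_message(obj):
    """Turn a simple data structure into bytes suitable for any byte transport."""
    return json.dumps(obj, sort_keys=True).encode("utf-8")

def decode_message(data):
    """Inverse of encode_message: bytes back into a Python data structure."""
    return json.loads(data.decode("utf-8"))

if __name__ == "__main__":
    wire = encode_message({"op": "get_state", "client_id": 42})
    print(decode_message(wire))
```

The same two functions would work unchanged whether the bytes travel over a local socket or over TCP/IP, which is the property the original question asks for.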
Laszlo Nagy <gand...@shopzeus.com> writes:
> application will be running on a single computer, so the IPC should be
> using shared memory (or mmap) and have very short response times.
Zeromq (suggested by someone) is an option since it's pretty fast for
most purposes, but I don't think it uses shared memory. The closest
thing I can think of to what you're asking is MPI, intended for
scientific computation. I don't know of general purpose IPC that uses
it though I've thought it would be interesting. There are also some
shared memory modules around, including POSH for shared objects, but
they don't switch between memory and sockets AFAIK.
Based on your description, maybe what you really want is Erlang, or
something like it for Python. There would be more stuff to do than just
supply an IPC library.
> There are just so many IPC modules out there. I'm looking for a solution
> for developing a new a multi-tier application. The core application will
> be running on a single computer, so the IPC should be using shared
> memory (or mmap) and have very short response times. But there will be a
> tier that will hold application state for clients, and there will be
> lots of clients. So that tier needs to go to different computers. E.g.
> the same IPC should also be accessed over TCP/IP. Most messages will be
> simple data structures, nothing complicated. The ability to run on PyPy
> would, and also to run on both Windows and Linux would be a plus.
The inter-process transport is currently only implemented on operating systems that provide UNIX domain sockets.
(OFF: Would it be possible to add local IPC support for Windows using mmap()? I have seen others doing it.)
At least it is functional on Windows, and it excels on Linux. I just need to make the transports configurable. Good enough for me.
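A minimal sketch of what "configurable transports" could look like: pick a local endpoint where UNIX domain sockets are available and fall back to TCP elsewhere. The `ipc://` and `tcp://` prefixes follow zeromq's endpoint syntax; the socket path, port number and function name are assumptions made up for the example.

```python
import sys

# Hypothetical endpoints, chosen only for illustration.
LOCAL_ENDPOINT = "ipc:///tmp/myapp.sock"
TCP_ENDPOINT = "tcp://127.0.0.1:5555"

def pick_endpoint(same_host, platform=sys.platform):
    """Return the endpoint string a peer should connect to.

    The inter-process transport is used only for same-host peers on
    platforms assumed to provide UNIX domain sockets; everything else
    goes over TCP/IP.
    """
    if same_host and not platform.startswith("win"):
        return LOCAL_ENDPOINT
    return TCP_ENDPOINT
```

The rest of the application would only ever see the returned string, so switching a tier from local to remote becomes a configuration change rather than a code change.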
> The closest
> thing I can think of to what you're asking is MPI, intended for
> scientific computation. I don't know of general purpose IPC that uses
> it though I've thought it would be interesting. There are also some
> shared memory modules around, including POSH for shared objects, but
> they don't switch between memory and sockets AFAIK.
> Based on your description, maybe what you really want is Erlang, or
> something like it for Python. There would be more stuff to do than just
> supply an IPC library.
Yes, although I would really like to do this job in Python. I'm going to run some tests with zeromq. If the speed is good for local inter-process communication, then I'll give it a try.
> There are just so many IPC modules out there. I'm looking for a
> solution for developing a new a multi-tier application. The core
> application will be running on a single computer, so the IPC should
> be using shared memory (or mmap) and have very short response times.
Probably the fastest IPC/RPC implementation for Python is
omniORBpy:
It's cross-platform, language-independent and standard-(Corba-)
compliant.
> I have seen a stand alone cross platform IPC server before that could
> serve "channels", and send/receive messages using these channels. But
> I don't remember its name and now I cannot find it. Can somebody
> please help?
If it's just for "messaging", Spread should be interesting:
On Friday, August 31, 2012 2:22:00 PM UTC-5, Laszlo Nagy wrote:
> There are just so many IPC modules out there. I'm looking for a solution
> for developing a new a multi-tier application. The core application will
> be running on a single computer, so the IPC should be using shared
> memory (or mmap) and have very short response times. But there will be a
> tier that will hold application state for clients, and there will be
> lots of clients. So that tier needs to go to different computers. E.g.
> the same IPC should also be accessed over TCP/IP. Most messages will be
> simple data structures, nothing complicated. The ability to run on PyPy
> would, and also to run on both Windows and Linux would be a plus.
> I have seen a stand alone cross platform IPC server before that could
> serve "channels", and send/receive messages using these channels. But I
> don't remember its name and now I cannot find it. Can somebody please help?
> Thanks,
> Laszlo
Hi Laszlo,
There aren't a lot of ways to create a Python object in an "mmap" buffer. "mmap" is conducive to arrays of arrays. For variable-length structures like strings and lists, you need "dynamic allocation". The C functions "malloc" and "free" allocate memory space, and file creation and deletion routines operate on disk space. However "malloc" doesn't allow you to allocate memory space within memory that's already allocated. Operating systems don't provide that capability, and doing it yourself amounts to creating your own file system. If you did, you still might not be able to use existing libraries like the STL or Python, because one address might refer to different locations in different processes.
One solution is to keep a linked list of free blocks within your "mmap" buffer. It is prone to slow access times and segment fragmentation. Another solution is to create many small files with fixed-length names. The minimum file size on your system might become prohibitive depending on your constraints, since a 4-byte integer could occupy 4096 bytes on disk or more. Or you can serialize the arguments and return values of your functions, and make requests to a central process.
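As a rough illustration of why "mmap" suits fixed-size data, here is a minimal sketch that stores fixed-length records in an anonymous mapping with the standard "struct" module, so no dynamic allocation inside the buffer is needed. The record layout (a 4-byte id plus an 8-byte float) and the slot count are assumptions chosen for the example.

```python
import mmap
import struct

# Hypothetical record layout: little-endian uint32 id + float64 value (12 bytes).
RECORD = struct.Struct("<Id")
SLOTS = 16

# Anonymous mapping; a real application would map a file or a named
# region so that several processes can open the same buffer.
buf = mmap.mmap(-1, RECORD.size * SLOTS)

def write_record(slot, ident, value):
    """Pack one record directly into its fixed offset in the buffer."""
    RECORD.pack_into(buf, slot * RECORD.size, ident, value)

def read_record(slot):
    """Unpack the record stored at the given slot as an (id, value) tuple."""
    return RECORD.unpack_from(buf, slot * RECORD.size)
```

Because every record lives at `slot * RECORD.size`, processes agree on offsets rather than on addresses, which sidesteps the problem of one address referring to different locations in different processes. Variable-length strings and lists do not fit this scheme, which is where the free-list or serialization approaches come in.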
Laszlo Nagy wrote:
> There are just so many IPC modules out there. I'm looking for a
> solution for developing a new a multi-tier application. The core
> application will be running on a single computer, so the IPC should be
> using shared memory (or mmap) and have very short response times. But
> there will be a tier that will hold application state for clients, and
> there will be lots of clients. So that tier needs to go to different
> computers. E.g. the same IPC should also be accessed over TCP/IP. Most
> messages will be simple data structures, nothing complicated. The
> ability to run on PyPy would, and also to run on both Windows and
> Linux would be a plus.
> I have seen a stand alone cross platform IPC server before that could
> serve "channels", and send/receive messages using these channels. But
> I don't remember its name and now I cannot find it. Can somebody
> please help?
> Thanks,
> Laszlo
Maybe execnet (http://pypi.python.org/pypi/execnet)? It sends and receives messages through channels, though it is probably not the only library that does. It requires only Python to be installed on the remote machines.
I always thought that the multiprocessing module does NOT use shared
memory (at least not under Windows).
My understanding was that it forks (or does whatever is closest to fork
under Windows) and uses sockets and pickle to communicate between the
processes. However, perhaps I just misunderstood; I never spent time diving into the internals of multiprocessing.
I would be very interested in a cross-platform shared memory solution for
Python.
Could you please point me to the right section?
> There aren't a lot of ways to create a Python object in an "mmap" buffer. "mmap" is conducive to arrays of arrays. For variable-length structures like strings and lists, you need "dynamic allocation". The C functions "malloc" and "free" allocate memory space, and file creation and deletion routines operate on disk space. However "malloc" doesn't allow you to allocate memory space within memory that's already allocated. Operating systems don't provide that capability, and doing it yourself amounts to creating your own file system. If you did, you still might not be able to use existing libraries like the STL or Python, because one address might refer to different locations in different processes.
> One solution is to keep a linked list of free blocks within your "mmap" buffer. It is prone to slow access times and segment fragmentation. Another solution is to create many small files with fixed-length names. The minimum file size on your system might become prohibitive depending on your constraints, since a 4-byte integer could occupy 4096 bytes on disk or more. Or you can serialize the arguments and return values of your functions, and make requests to a central process.
I'm not sure about the technical details, but I was told that the multiprocessing module uses mmap() under Windows, and that it is faster than TCP/IP. So I guess the same approach could be used by zmq under Windows. (It is not a big concern: I plan to run the server on Unix. Some clients might be running on Windows, but they will use TCP/IP.)
> I always thought, that the multiprocessing module does NOT use shared
> memory (at least not under windows)
It uses mmap() under Windows. (I'm not an expert, but this is what I was told by others.) I did not know that multiprocessing can be used over TCP/IP. :) I'll probably use zmq instead, because it has other nice features (auto reconnect, publisher/subscriber, multicast etc.)
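For reference, a minimal sketch of multiprocessing over TCP/IP using the standard library's multiprocessing.connection module. The server runs in a thread here only to keep the example self-contained; the authkey value and the function names are made up for the demonstration, and port 0 asks the OS for any free port.

```python
import threading
from multiprocessing.connection import Client, Listener

AUTHKEY = b"demo-secret"  # hypothetical shared secret for the handshake

def serve_once(listener):
    """Accept one connection, read one pickled object, echo it back."""
    with listener.accept() as conn:
        request = conn.recv()
        conn.send({"echo": request})

def round_trip(message):
    """Send a message to a local TCP listener and return its reply."""
    with Listener(("127.0.0.1", 0), authkey=AUTHKEY) as listener:
        server = threading.Thread(target=serve_once, args=(listener,))
        server.start()
        # listener.address holds the port actually chosen by the OS.
        with Client(listener.address, authkey=AUTHKEY) as conn:
            conn.send(message)
            reply = conn.recv()
        server.join()
    return reply
```

Objects are pickled on send() and unpickled on recv(), so this only suits peers that trust each other; it also lacks the reconnect and publish/subscribe features mentioned above.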
> I would be very interested in a cross platform shared mem solution for
> python.
> Could you please point me to the right section.
As far as I know, POSIX-compatible shared memory does not exist on Windows. I remember a thread about this on the PostgreSQL mailing list: the Windows version of PostgreSQL somehow emulates shared memory too. I wanted to use shared memory because its response times are much shorter than those of TCP/IP.
> It's cross-platform, language-independent and standard-(Corba-)
> compliant.
I don't want to use IDL though. Clients will be written in Python, and it would be a waste of time to write IDL files.
>> I have seen a stand alone cross platform IPC server before that could
>> serve "channels", and send/receive messages using these channels. But
>> I don't remember its name and now I cannot find it. Can somebody
>> please help?
> If it's just for "messaging", Spread should be interesting:
> So, it really depends on whether you are trying to build a parallel
> system or distributed system. They are related to each other, but the
> implied connotations/goals are different. Parallel programming deals
> with increasing computational power by using multiple computers
> simultaneously. Distributed programming deals with reliable
> (consistent, fault-tolerant and highly available) group of computers.
I don't know the full theory behind distributed programming or parallel programming. ZMQ seems easier to use.
> <snip>
> Note that this difference mainly applies to how the processes are
> themselves are created... How the library wraps shared data is
> possibly different (I've never understood how a "fork" process can
> avoid memory conflicts if it has write access to common virtual memory
> blocks).
Here's an approximate description of fork, at least for the memory
aspects. During a fork, the virtual memory table is copied (that's
descriptors for all mapped and allocated memory) but the memory itself
is NOT. All the new descriptors are labeled "COW" (copy-on-write). As
that process executes, the first time it writes in a particular memory
block, the OS gets a memory fault, which it fixes by allocating a block
of the same size, copying the memory block to the new one, and labeling
it read/write. Subsequent accesses to the same block are normal, with no
trace of the fork remaining.
Now, there are lots of details that this blurs over, but it turns out
that many times the new process doesn't change very much. For example,
all the mappings to the executable and to shared libraries are
theoretically readonly. In fact, they might have also been labeled COW
even for the initial execution of the program. Another place that's
blurry is just what the resolution of this table actually is. There are
at least two levels of tables. The smallest increment on the Pentium
family is 4k.
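The visible effect of copy-on-write can be shown with a small sketch (POSIX only, since os.fork() is not available on Windows). The child overwrites a buffer it inherited from the parent and reports what it sees through a pipe; the parent's copy stays untouched. The function name is made up for the example, and the page-level copying itself happens inside the kernel and is not observable from Python.

```python
import os

def fork_demo():
    """Fork, let the child modify inherited data, return (parent_view, child_view)."""
    data = bytearray(b"parent")
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this write triggers the copy; the parent's pages are unaffected.
        os.close(read_fd)
        data[:] = b"child!"
        os.write(write_fd, bytes(data))
        os._exit(0)  # leave immediately without running parent cleanup code
    # Parent: read what the child saw, then reap the child.
    os.close(write_fd)
    child_view = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return bytes(data), child_view
```

This is also why plain fork does not give shared memory for free: writes are private to each process unless the region was explicitly mapped as shared.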
> On Friday, 7 September 2012 02:25:15 UTC+5:30, Dave Angel wrote:
>> On 09/06/2012 04:33 PM, Dennis Lee Bieber wrote:
>>> <snip>
>>> Note that this difference mainly applies to how the processes are
>>> themselves are created... How the library wraps shared data is
>>> possibly different (I've never understood how a "fork" process can
>>> avoid memory conflicts if it has write access to common virtual memory
>>> blocks).
>> Here's an approximate description of fork, at least for the memory
>> aspects. During a fork, the virtual memory table is copied (that's
>> descriptors for all mapped and allocated memory) but the memory itself
>> is NOT. All the new descriptors are labeled "COW" (copy-on-write). As
>> that process executes, the first time it writes in a particular memory
>> block, the OS gets a memory fault, which it fixes by allocating a block
>> of the same size, copying the memory block to the new one, and labeling
>> it read/write. Subsequent accesses to the same block are normal, with no
>> trace of the fork remaining.
>> Now, there are lots of details that this blurs over, but it turns out
>> that many times the new process doesn't change very much. For example,
>> all the mappings to the executable and to shared libraries are
>> theoretically readonly. In fact, they might have also been labeled COW
>> even for the initial execution of the program. Another place that's
>> blurry is just what the resolution of this table actually is. There are
>> at least two levels of tables. The smallest increment on the Pentium
>> family is 4k.
>> --
>> DaveA
> From my OS development experience, there are two sizes of pages - 4K and 1 byte
No - pages are always 4k (or bigger given superpage support).
You are probably thinking about the granularity bit in descriptor
entries, which is relevant for segmentation. The granularity bit
switches the unit of the segment limit between 1 byte and 4K chunks, as
there are only 20 bits of limit space in a 32-bit {L,G}DT entry.
However, in the modern days of paging, segmentation support is only for
legacy compatibility.
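The arithmetic behind the granularity bit, as a small sketch: with a 20-bit limit field, the largest segment is 1 MiB when the limit counts bytes and 4 GiB when it counts 4K chunks. The function name is made up for the example.

```python
LIMIT_BITS = 20  # width of the limit field in a 32-bit descriptor entry

def max_segment_size(granularity_bit):
    """Largest segment size in bytes for a given granularity bit (0 or 1)."""
    unit = 4096 if granularity_bit else 1
    return (1 << LIMIT_BITS) * unit

if __name__ == "__main__":
    print(max_segment_size(0))  # limit counted in bytes
    print(max_segment_size(1))  # limit counted in 4K chunks
```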
On Fri, 2012-08-31 at 21:04 +0200, Laszlo Nagy wrote:
> I have seen a stand alone cross platform IPC server before that could
> serve "channels", and send/receive messages using these channels. But I
> don't remember its name and now I cannot find it. Can somebody please help?
Just having a real and robust message broker is fabulous. It is
tempting at first to want to avoid 'external' components; but
development on top of a fully-featured message bus is extremely
addictive. Working with Rabbit & AMQ is very pleasant, and every time I
have an uh-oh-i-need-to-deal-with... moment I discover that
aha-rabbit-can-deal-with-that and it's back-to-my-application.