I've done quite a bit of reading and checking on this since yesterday and my conclusion is we should embrace Google Protocol Buffers (PB), perhaps in general, but certainly for the time being with the OpenTick and IB integration efforts.
There are 3 types of interactions with the server.
1. Conventional RPC with request/response. Essentially, the call returns "immediately" from the server's perspective. The client can block or implement asynchronous callbacks.
2. RPC with finite response - when the server itself needs to wait for data from the IB application (because it needs to call IB over the WAN) the RPC return should be an id to be used to associate callback-based responses with the request. The client will need to implement an API to be called back with the data. Sometimes it's a single callback, other times it's more involved (e.g. search responses, portfolio updates etc).
3. RPC with infinite response - infinite might be a bit larger than I mean, but here I'm talking about things like market data where we're registering a general callback for the data we're "subscribing" to.
IB seems more straightforward so we'll start there. For the most part, I think we can just map our API to IB's Java API for EWrapper and EClientSocket. Support objects like Contract, ContractLeg etc. can likely be modeled using data structures. Convenience methods we provide for specific languages to help construct these objects are a separate issue.
The OpenTick architecture is a bit more complicated in that we have a lot of options and cython. The good news is that the api is less complicated than IB's because we're just grabbing market data, not executing trades or getting related information.
With OpenTick, nothing has changed... yet. We should wrap the API and get the data into cython / python. However, with OpenTick, performance is critical (particularly if we ever start doing things with Xasax). At that stage, we might want to extend the lower layers to provide a PB transport to remote clients. This would mean pure C/C++ from the source to the final consumer. For now, this isn't critical, but if we start really suckin' from the pipe, it could make a huge difference.
We should give one look-see at the Java api for OpenTick although I doubt it makes sense given C++ and cython... but we should look to make sure.
-glenn
-- Glenn H. Tarbox, PhD || 206-494-0819 || gl...@tarbox.org "Don't worry about people stealing your ideas. If your ideas are any good you'll have to ram them down peoples throats" -- Howard Aiken
Concerning wrapping the opentick APIs, methinks I can get started on that today. I'll start with finance.Stock as a framework and move from there. Should be quite natural using cython.
I read a bit about those Google APIs. Not sure where exactly they would be implemented into the opentick situation.
-- Chris Swierczewski cswie...@gmail.com mobile: 253 2233721
On Jul 8, 2008, at 9:46 AM, "Glenn H Tarbox, PhD" <gl...@tarbox.org> wrote:
> I've done quite a bit of reading and checking on this since yesterday > and my conclusion is we should embrace Google Protocol Buffers (PB), > perhaps in general, but certainly for the time being with the OpenTick > and IB integration efforts.
On Tue, 2008-07-08 at 10:19 -0700, Chris Swierczewski wrote: > Concerning wrapping the opentick APIs, methinks I can get started on > that today. I'll start with finance.Stock as a framework and move from > there. Should be quite natural using cython.
> I read a bit about those Google APIs. Not sure where exactly they > would be implemented into the opentick situation.
That'll come later. All I was saying is that there may be clients which need to suck from the pipe at a high rate. Since we can generate the transport fairly simply into C++, we might want to extend the server (later) to support no-python routing... meaning in from OpenTick C++ directly out to PB using generated C++...
So, for OpenTick, it's do nothing... just something to keep in mind.
IB should probably go VFR direct to PB... it's gonna make things easier and perform better than XML-RPC.
-- Glenn H. Tarbox, PhD
I like this approach for IB and I think I'm going to start doing it this way. The problem, however, is that EWrapper is not a class, it's an interface, so I think we'll have to write our own Sage-specific implementing class, but I think this is what we've been talking about all along. I also don't think we have to send the support classes; we can just use the PB data structures. I'm going to be looking into doing some of this today as well as reading over a lot of the Google PB RPC API.
On Tue, 2008-07-08 at 13:48 -0700, Brett Nakashima wrote: > I like this approach for IB and I think I'm going to start doing it this > way. The problem however, is that EWrapper is not a class, it's an > interface, so I think we'll have to write our own sage-specific > implementing class but I think this is what we've been talking about all > along.
Right. The interface needs to be implemented as an RPC to the actual client... another case of the inverted client-server model... in this case, for callbacks, the client has the server API for cases 1) and 2) below.
For EClientSocket (I think that's the name) it's the inverse... but this is all entirely consistent with the current IB API programming strategy. We're simply using the "over the wire" protocol to do our language boundary mapping as well.
The only logic outside the conventional API is to support multiple clients. The nailup typically involves making a call to the "root" object inside the server you're writing and passing in the return server reference as a parameter. This approach is superior to "inferring" the existence of a particular instance on the client from the (address, port) pair alone. How we decide to actually implement this is open, but the general strategy is the (host, port, objectid) distributed object remote reference scheme (same as CORBA, foolscap etc).
So, when a client registers for something which is going to get a callback, the server should return, as a return value, an integer or some other id which the client (with its server object) can use to match each incoming "response" to the callback it belongs to.
Your code needs to maintain, for each callback from IB, a reference to the client which gets the information. My recommendation is to simply increment an integer with each new request... then the value of the integer uniquely identifies the remote object which is to get the response.
It gets tricky if we want to handle garbage collection etc... but we don't need to worry about that up front. Let's minimize the work on the Java side for now and we'll fix it after we've done it wrong a few times.
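A minimal sketch of the incrementing-integer idea, written in Python purely for illustration (the real bookkeeping would live on the Java side next to the IB callbacks). The names RemoteRef and CallbackRegistry, and the sample hosts and ports, are made up for this example and are not part of IB's API or of PB. It is single-threaded and does no garbage collection of stale ids.

```python
import itertools
from typing import Dict, NamedTuple


class RemoteRef(NamedTuple):
    """Hypothetical (host, port, objectid) reference to a client's callback object."""
    host: str
    port: int
    object_id: int


class CallbackRegistry:
    """Maps each request id to the client reference that should get the response."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)   # 1, 2, 3, ... one per new request
        self._pending: Dict[int, RemoteRef] = {}

    def register(self, client: RemoteRef) -> int:
        """Record who asked and return the id the server hands back to them."""
        request_id = next(self._next_id)
        self._pending[request_id] = client
        return request_id

    def lookup(self, request_id: int) -> RemoteRef:
        """Find which client an incoming IB callback belongs to."""
        return self._pending[request_id]

    def release(self, request_id: int) -> None:
        """Forget a finished request (finite responses only)."""
        self._pending.pop(request_id, None)


registry = CallbackRegistry()
alice = RemoteRef("10.0.0.5", 7001, 1)
bob = RemoteRef("10.0.0.6", 7001, 1)
first = registry.register(alice)
second = registry.register(bob)
```

When a callback arrives from IB tagged with an id, registry.lookup(id) gives the client to forward it to. For the "infinite response" case the entry simply stays registered until the client unsubscribes.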
> I also don't think we have send the the support classes, we can > just use the PB data structures. I'm going to be looking into doing > some of this today as well as reading over a lot of the Google PB RPC API.
right... my point was that the support classes need to be implemented as a library for each language supported (should we choose to do so). We'll probably do something for python... but that's another task entirely....
> > I've done quite a bit of reading and checking on this since yesterday > > and my conclusion is we should embrace Google Protocol Buffers (PB), > > perhaps in general, but certainly for the time being with the OpenTick > > and IB integration efforts.
> > There are 3 types of interactions with the server.
> > 1. Conventional RPC with request/response. Essentially, the call > > returns "immediately" from the server's perspective. The client > > can block or implement asynchronous callbacks. > > 2. RPC with finite response - when the server itself needs to wait > > for data from the IB application (because it needs to call IB > > over the wan) the RPC return should be an id to be used to > > associate callback based responses with the request. The client > > will need to implement an API to be called back with the data. > > Sometimes its a single callback, other times its more involved > > (e.g. search responses, portfolio updates etc) > > 3. RPC with infinite response - infinite might be a bit larger than > > I mean, but here I'm talking about things like market data where > > we're registering a general callback for the data we're > > "subscribing" to.
-- Glenn H. Tarbox, PhD
Once you figure out the fundamentals of getting OpenTick and cython integrated, you and I should chat about how to handle the event loop. The easiest thing is to use twisted and nail up the necessary file descriptors to invoke your code from clients, opentick or a timer (the latter likely just for general status checks etc).
I can write a simple wrapper for you when you get to that point.
On Tue, 2008-07-08 at 10:28 -0700, Glenn H Tarbox, PhD wrote: > On Tue, 2008-07-08 at 10:19 -0700, Chris Swierczewski wrote: > > Concerning wrapping the opentick APIs, methinks I can get started on > > that today. I'll start with finance.Stock as a framework and move from > > there. Should be quite natural using cython.
-- Glenn H. Tarbox, PhD
> Once you figure out the fundamentals of getting OpenTick and cython > integrated, you and I should chat about how to handle the event loop. > The easiest thing is to use twisted and nail up the necessary file > descriptors to invoke your code from clients, opentick or a timer (the > latter likely just for general status checks etc).
> I can write a simple wrapper for you when you get to that point.
I was contemplating that on my way home today. Some guidance would be nice. I'll begin to review some twisted documentation when that time comes so I can keep up with you.
On Tue, 2008-07-08 at 20:47 -0700, Chris Swierczewski wrote: > Glenn,
> I was contemplating that on my way home today. Some guidance would be > nice. I'll begin to review some twisted documentation when that time > comes so I can keep up with you.
So, here's the thing. What you almost certainly have is code which expects to block on a pipe during a read. The natural tendency will be to spawn a thread. While that might be the way we go, it may not be optimal because the callback will emerge into python in a thread which isn't the main thread...
There are a lot of ways to deal with this. We can implement the callback to queue data for pickup by the main thread... this is easy to do either using standard synchronized python queues... or if we use a "native" thread, using posix synchronization mechanisms. The twisted way is to do a reactor.callFromThread(...) with the payload (not reactor.callLater, which isn't thread safe). callFromThread is thread safe and when the reactor gets the thread back it'll "do the right thing" in the main thread.
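A rough sketch of the queue option using only the standard library. The reader thread here just walks a list of fake byte strings; in the real thing it would be whatever thread the OpenTick client code runs its callbacks on. The names ticks, reader_thread and drain_in_main_thread are invented for the example.

```python
import queue
import threading

ticks = queue.Queue()   # synchronized queue shared by both threads
_DONE = object()        # sentinel telling the main thread the feed has ended


def reader_thread(raw_messages):
    """Stand-in for the blocking reader: pushes each message onto the queue."""
    for message in raw_messages:
        ticks.put(message)
    ticks.put(_DONE)


def drain_in_main_thread(handler):
    """Runs in the main thread: pops messages and hands them to the handler."""
    handled = 0
    while True:
        item = ticks.get()      # blocks until the reader has put something
        if item is _DONE:
            return handled
        handler(item)
        handled += 1


received = []
worker = threading.Thread(
    target=reader_thread, args=([b"tick-1", b"tick-2", b"tick-3"],)
)
worker.start()
count = drain_in_main_thread(received.append)
worker.join()
```

The handler only ever runs in the main thread, so nothing above the queue needs locking. With Twisted, reactor.callFromThread is the documented call for handing work from another thread to the reactor thread, and would replace the queue and the drain loop.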
But....
What we'd really like to do is hand blocks of bytes into something which does as much as it can. If it's got a full message, we want it to hand it back to us or execute a callback which, since it's the main thread, would be ok. If it needs more bytes (partial message), it just returns having done nothing.
Of course, the code then needs to pick up where it left off with additional data... unless the code is written that way, this might get ugly. If it's written, for example, as some kind of recursive descent parser (unlikely, but it might be using the stack / program counter as the "implied" state machine) we'll need to think about it some.
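To make the "hand it bytes, get back whatever is complete" idea concrete, here is a small Python sketch. The wire format (a 4-byte big-endian length followed by the payload) is an assumption made up for the example; it says nothing about how OpenTick actually frames its messages. MessageFramer and frame are likewise hypothetical names.

```python
import struct
from typing import List


class MessageFramer:
    """Accumulates byte blocks and returns only the messages that are complete."""

    HEADER = struct.Struct(">I")  # assumed framing: 4-byte big-endian length

    def __init__(self) -> None:
        self._buffer = bytearray()  # leftover bytes from earlier feed() calls

    def feed(self, data: bytes) -> List[bytes]:
        """Add a block of bytes; return zero or more complete payloads."""
        self._buffer.extend(data)
        messages = []
        while True:
            if len(self._buffer) < self.HEADER.size:
                break                       # header itself is still partial
            (length,) = self.HEADER.unpack_from(self._buffer)
            end = self.HEADER.size + length
            if len(self._buffer) < end:
                break                       # payload is still partial
            messages.append(bytes(self._buffer[self.HEADER.size:end]))
            del self._buffer[:end]          # keep only the unconsumed tail
        return messages


def frame(payload: bytes) -> bytes:
    """Build one message in the assumed wire format."""
    return MessageFramer.HEADER.pack(len(payload)) + payload


framer = MessageFramer()
stream = frame(b"quote") + frame(b"trade")
part_one = framer.feed(stream[:6])   # header plus 2 payload bytes: nothing yet
part_two = framer.feed(stream[6:])   # the rest: both messages complete
```

All of the "where did I leave off" state is the leftover buffer, which is what makes this shape easy to drive from an event loop: feed() never blocks and never needs a thread of its own.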
I should take a look at the code myself and see if this is easily implemented.
There is another consideration... if written properly, it's not the worst thing in the world to use an actual "native" machine thread to suck from the pipe and handle messages as they come in. Here we'll get multiple cores working for us, and as the code is C/C++, and if we don't screw it up, all we need to do is make sure that the handoff into the upper levels is done correctly... this is more than possible, but care is necessary. Methinks that this code is sufficiently segmented that it might not be a problem.
ok, I'm gonna download the latest version now... gotta look...
-glenn
There would be a fancier method using generators but that's a construct not available in C/C++.
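For what it's worth, this is roughly what the generator version looks like on the Python side. The generator's local variables and its suspension point play the role of the "implied" state machine, so the parsing code reads top to bottom even though bytes arrive in arbitrary chunks. The 4-byte length-prefix format and the name message_parser are assumptions for the example only.

```python
import struct


def message_parser(on_message):
    """Generator-based parser: send() it byte blocks, it calls on_message
    once per complete message and suspends whenever it needs more bytes."""
    buffer = bytearray()
    while True:
        # Suspend until the assumed 4-byte big-endian length header is in.
        while len(buffer) < 4:
            buffer.extend((yield))
        (length,) = struct.unpack_from(">I", buffer)
        # Suspend until the whole payload is in.
        while len(buffer) < 4 + length:
            buffer.extend((yield))
        on_message(bytes(buffer[4:4 + length]))
        del buffer[:4 + length]


seen = []
parser = message_parser(seen.append)
next(parser)  # prime the generator so it is waiting at the first yield

wire = struct.pack(">I", 3) + b"abc" + struct.pack(">I", 2) + b"hi"
for i in range(0, len(wire), 5):     # deliver in awkward 5-byte chunks
    parser.send(wire[i:i + 5])
```

Each send() runs the parser as far as the available bytes allow and then returns, so it can be called straight from the main thread's event loop.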
> -- > Chris
-- Glenn H. Tarbox, PhD
ok, belay all that... clearly there's a lot more going on in the code...
So, looks like we wrap OTClient... let it do whatever thread madness it wants, and synchronize the callbacks. Seems that synchronizing the callbacks would take the form of reactor.callFromThread calls, since callLater isn't safe to call from a non-reactor thread.
The next step would be to take the message format and rewrite the code... I doubt it would be that hard but it would be silly to take that on now.
The only possible problem, and I doubt that this is gonna be a problem, is if there's any blocking on the client side... I doubt there is...
On Tue, 2008-07-08 at 21:29 -0700, Glenn H Tarbox, PhD wrote: > On Tue, 2008-07-08 at 20:47 -0700, Chris Swierczewski wrote: > > Glenn,
-- Glenn H. Tarbox, PhD