Focus? (Re: [sage-finance] Re: Google Protocol Buffers...)


William Stein

Jul 9, 2008, 1:08:14 AM
to sage-f...@googlegroups.com
On Tue, Jul 8, 2008 at 9:51 PM, Glenn H Tarbox, PhD <gl...@tarbox.org> wrote:
>
> ok, belay all that... clearly there's a lot more going on in the code...
>
> So, looks like we wrap OTClient... let it do whatever thread madness it
> wants, and synchronize the callbacks. Seems that synchronizing the
> callbacks would take the form of reactor.callLater's

Why don't we write a simple opentick client that *works* from Python,
so that I can use it in my SIMUW course, then worry about all
this threading, synchronization, and callback stuff later?
This just all seems like premature optimization being the root
of all evil and spending money we don't have yet. Once we have
a chunk of code that does something useful and is 100% bug-free
-- which should in theory be easy enough to write (I wish), I will
have enough for my students in SIMUW (which is in just over
two weeks!). Then it makes sense to worry about all that
other stuff that you're discussing.

It will be very annoying if everyone spins their wheels in circles
trying to do something really complicated, with the net result
being that I have to just sit down and write an opentick interface
myself so that I have something which actually works for
my class.

-- William

>
> the next step would be to take the message format and rewrite the
> code... I doubt it would be that hard but it would be silly to take that
> on now.
>
> The only possible problem, and I doubt that this is gonna be a problem,
> is if there's any blocking on the client side... I doubt there is...
>
> -glenn
>
> On Tue, 2008-07-08 at 21:29 -0700, Glenn H Tarbox, PhD wrote:
>> On Tue, 2008-07-08 at 20:47 -0700, Chris Swierczewski wrote:
>> > Glenn,
>> >
>> > > Once you figure out the fundamentals of getting OpenTick and cython
>> > > integrated, you and I should chat about how to handle the event loop.
>> > > The easiest thing is to use twisted and nail up the necessary file
>> > > descriptors to invoke your code from clients, opentick or a timer (the
>> > > latter likely just for general status checks etc).
>> > >
>> > > I can write a simple wrapper for you when you get to that point.
>> >
>> > I was contemplating that on my way home today. Some guidance would be
>> > nice. I'll begin to review some twisted documentation when that time
>> > comes so I can keep up with you.
>>
>> So, here's the thing. What you almost certainly have is code which
>> expects to block on a pipe during a read. The natural tendency will be
>> to spawn a thread. While that might be the way we go, it may not be
>> optimal because the callback will emerge into python in a thread which
>> isn't the main thread...
>>
>> There are a lot of ways to deal with this. We can implement the
>> callback to queue data for pickup by the main thread... this is easy to
>> do either using standard synchronized python queues... or if we use a
>> "native" thread, using posix synchronization mechanisms. The twisted
>> way is to do a reactor.callLater(0.0,...) with the payload. This is
>> thread safe and when the reactor gets the thread back it'll "do the
>> right thing" in the main thread.
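A minimal sketch of the queue hand-off described above, using only the Python standard library. The `fake_feed` function is a hypothetical stand-in for a client library that fires callbacks on its own thread; it is not the OpenTick API.

```python
import queue
import threading

def fake_feed(on_tick):
    """Hypothetical stand-in for a client that calls back on its own thread."""
    for price in (101.5, 101.7, 101.6):
        on_tick("SPY", price)

def collect_ticks():
    inbox = queue.Queue()              # thread-safe FIFO shared by both threads

    def on_tick(symbol, price):
        # Runs on the worker thread: do no real work here, just enqueue.
        inbox.put((symbol, price))

    worker = threading.Thread(target=fake_feed, args=(on_tick,))
    worker.start()

    received = []
    while len(received) < 3:
        # Runs on the calling (main) thread; blocks until a tick arrives.
        received.append(inbox.get(timeout=5.0))
    worker.join()
    return received

ticks = collect_ticks()
```

For the Twisted variant, the call that Twisted documents as safe to make from a non-reactor thread is `reactor.callFromThread(f, *args)`; `reactor.callLater` is meant to be called from the reactor thread itself.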
>>
>> But....
>>
>> What we'd really like to do is hand blocks of bytes into something which
>> does as much as it can. If it's got a full message, we want it to hand
>> it back to us or execute a callback which, since it's the main thread,
>> would be ok. If it needs more bytes (partial message), it just returns
>> having done nothing.
>>
>> Of course, the code then needs to pick up where it left off with
>> additional data... unless the code is written that way, this might get
>> ugly. If it's written, for example, as some kind of recursive descent
>> parser (unlikely, but it might be using the stack / program counter as
>> the "implied" state machine) we'll need to think about it some.
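The "hand it bytes, get back whatever is complete" idea could look roughly like the sketch below. The framing (a 4-byte big-endian length followed by the payload) and the `FrameBuffer` name are assumptions made up for illustration; this is not the OpenTick wire format.

```python
import struct

class FrameBuffer:
    """Accumulates bytes and returns complete length-prefixed messages.

    Hypothetical framing: 4-byte big-endian length, then the payload.
    """
    HEADER = struct.Struct(">I")

    def __init__(self):
        self._buf = bytearray()        # leftover bytes of a partial message

    def feed(self, data):
        """Add bytes; return a list of every message that is now complete."""
        self._buf.extend(data)
        messages = []
        while True:
            if len(self._buf) < self.HEADER.size:
                break                  # header incomplete, wait for more bytes
            (length,) = self.HEADER.unpack_from(self._buf)
            end = self.HEADER.size + length
            if len(self._buf) < end:
                break                  # payload incomplete, wait for more bytes
            messages.append(bytes(self._buf[self.HEADER.size:end]))
            del self._buf[:end]        # drop the consumed message
        return messages
```

Because the only state is the leftover buffer, a call with a partial message returns an empty list and the next call picks up where the last one stopped.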
>>
>> I should take a look at the code myself and see if this is easily
>> implemented.
>>
>> There is another consideration... if written properly, it's not the worst
>> thing in the world to use an actual "native" machine thread to suck from
>> the pipe and handle messages as they come in. Here we'll get multiple
>> cores working for us. And as the code is C/C++, and if we don't screw
>> it up, all we need to do is make sure that the handoff into the upper
>> levels is done correctly.... this is more than possible... but care is
>> necessary. Methinks that this code is sufficiently segmented that it
>> might not be a problem.
>>
>> ok, I'm gonna download the latest version now... gotta look...
>>
>> -glenn
>>
>> There would be a fancier method using generators but that's a construct
>> not available in C/C++.
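For comparison, a rough Python sketch of the generator approach: the suspended generator frame holds the parser state, so the code reads like a blocking parser even though it never blocks. The 4-byte length-prefix framing and the `message_parser` name are hypothetical, chosen only for illustration.

```python
def message_parser(on_message):
    """Coroutine: send() it bytes; it calls on_message(payload) per full message."""
    buf = bytearray()
    while True:
        # Read a 4-byte big-endian length header (hypothetical framing).
        while len(buf) < 4:
            buf.extend((yield))        # suspend until the caller sends more bytes
        length = int.from_bytes(buf[:4], "big")
        del buf[:4]
        # Read the payload; suspend again whenever it is still incomplete.
        while len(buf) < length:
            buf.extend((yield))
        on_message(bytes(buf[:length]))
        del buf[:length]

out = []
parser = message_parser(out.append)
next(parser)                           # prime the coroutine to its first yield
for chunk in (b"\x00\x00", b"\x00\x03ab", b"c\x00\x00\x00\x01z"):
    parser.send(chunk)
```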
>>
>>
>> >
>> > --
>> > Chris
>> >
>> > > >
> --
> Glenn H. Tarbox, PhD || 206-494-0819 || gl...@tarbox.org
> "Don't worry about people stealing your ideas. If your ideas are any
> good you'll have to ram them down peoples throats" -- Howard Aiken
>
>
> >
>

--
William Stein
Associate Professor of Mathematics
University of Washington
http://wstein.org

Glenn H Tarbox, PhD

Jul 9, 2008, 1:20:39 AM
to sage-f...@googlegroups.com
On Tue, 2008-07-08 at 22:08 -0700, William Stein wrote:
> On Tue, Jul 8, 2008 at 9:51 PM, Glenn H Tarbox, PhD <gl...@tarbox.org> wrote:
> >
> > ok, belay all that... clearly there's a lot more going on in the code...
> >
> > So, looks like we wrap OTClient... let it do whatever thread madness it
> > wants, and synchronize the callbacks. Seems that synchronizing the
> > callbacks would take the form of reactor.callLater's
>
> Why don't we write a simple opentick client that *works* from Python,
> so that I can use it in my SIMUW course, then worry about all
> this threading, synchronization, and callback stuff later?

That was the intent of this email.

> This just all seems like premature optimization being the root
> of all evil and spending money we don't have yet. Once we have
> a chunk of code that does something useful and is 100% bug-free
> -- which should in theory be easy enough to write (I wish), I will
> have enough for my students in SIMUW (which is in just over
> two weeks!).

I think we can have something quickly. I didn't look at the code
carefully enough prior to sending the previous email. It took all of 10
minutes to reevaluate given what I saw.

> Then it makes sense to worry about all that
> other stuff that you're discussing.

Yup.

>
> It will be very annoying if everyone spins their wheels in circles
> trying to do something really complicated, with the net result
> being that I have to just sit down and write an opentick interface
> myself so that I have something which actually works for
> my class.

We're taking the route you'd take. I didn't realize that they had a
full up and full blown synchronized multi-threaded client as part of
their infrastructure. I was expecting a few parsing classes that got
nailed together. When I saw what there was, it was clear that wrapping
OTClient in cython was the obvious first step.

BTW, the reason I thought it might be less sophisticated is because they
publish their protocol... I figured their client classes simply
implemented the wire to structure part... as this isn't the case, I
rolled back the transaction which got you concerned.

Essentially, their approach is similar to IB.

On that point, for your class, I might suggest we use the current IB
implementation I have working. It grabs live data, can retrieve
historical data and has the full blown client which shows more market
information than we'll have working with OpenTick in the near term.

If you need more fine grained data for longer periods, we can suck from
OpenTick into a file and get it all prepared before the class starts.
There are applications for that right out of the box.

-glenn

William Stein

Jul 9, 2008, 1:26:44 AM
to sage-f...@googlegroups.com
On Tue, Jul 8, 2008 at 10:20 PM, Glenn H Tarbox, PhD <gl...@tarbox.org> wrote:
>
> On Tue, 2008-07-08 at 22:08 -0700, William Stein wrote:
>> On Tue, Jul 8, 2008 at 9:51 PM, Glenn H Tarbox, PhD <gl...@tarbox.org> wrote:
>> >
>> > ok, belay all that... clearly there's a lot more going on in the code...
>> >
>> > So, looks like we wrap OTClient... let it do whatever thread madness it
>> > wants, and synchronize the callbacks. Seems that synchronizing the
>> > callbacks would take the form of reactor.callLater's
>>
>> Why don't we write a simple opentick client that *works* from Python,
>> so that I can use it in my SIMUW course, then worry about all
>> this threading, synchronization, and callback stuff later?
>
> That was the intent of this email.

Awesome. I strongly agree with you.

:-)

>
> Essentially, their approach is similar to IB.
>
> On that point, for your class, I might suggest we use the current IB
> implementation I have working. It grabs live data, can retrieve
> historical data and has the full blown client which shows more market
> information than we'll have working with OpenTick in the near term.

Chris or Brett -- who volunteers to get this included in Sage asap?
Can either of you pull it off in the next two weeks? Glenn -- maybe
post a link to your code again here with some instructions.

> If you need more fine grained data for longer periods, we can suck from
> OpenTick into a file and get it all prepared before the class starts.

I don't need too much. I just need enough to make the class
interesting.

-- William

Glenn H Tarbox, PhD

Jul 9, 2008, 1:50:31 AM
to sage-f...@googlegroups.com

I think, given the time frame, that I should take the lead on this.
It'll save a week (in no small part due to the fact that I'm twisted
compliant, which is no small feat, and, um, my documentation skills are
also twisted compliant... which isn't a positive statement :-)

>
> > If you need more fine grained data for longer periods, we can suck from
> > OpenTick into a file and get it all prepared before the class starts.
>
> I don't need too much. I just need enough to make the class
> interesting.

Some suggestions:

Black-Scholes - based on geometric Brownian motion... show the math maybe a
little... but run some sims with the Brownian motion capabilities you
have in the timeseries class to show that it's right...
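A rough sketch of that check in plain Python (standard library only, rather than the Sage timeseries class): price a European call with the closed-form Black-Scholes formula, then again by simulating terminal prices of a geometric Brownian motion under the risk-neutral drift. The parameter values and function names are illustrative assumptions.

```python
import math
import random

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call on a non-dividend-paying stock."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # standard normal CDF
    return S * N(d1) - K * math.exp(-r * T) * N(d2)

def mc_call(S, K, T, r, sigma, n_paths=200_000, seed=0):
    """Monte Carlo price from simulated terminal GBM prices (risk-neutral drift r)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        ST = S * math.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)
        total += max(ST - K, 0.0)      # call payoff at expiry
    return math.exp(-r * T) * total / n_paths   # discount the average payoff

closed_form = bs_call(100.0, 100.0, 1.0, 0.05, 0.2)
simulated = mc_call(100.0, 100.0, 1.0, 0.05, 0.2)
```

The two numbers should agree up to Monte Carlo noise, which shrinks like one over the square root of the number of paths.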

Black-Scholes is also useful because you get into implied volatility
right away which requires an iterative solution. Simple but begins to
show where things get nasty.
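One simple way to do the iterative solve is bisection, since the Black-Scholes call price increases with volatility. This is a sketch under the assumption that the quoted price lies between the prices at the `lo` and `hi` volatilities; the function names and bounds are illustrative.

```python
import math

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call on a non-dividend-paying stock."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # standard normal CDF
    return S * N(d1) - K * math.exp(-r * T) * N(d2)

def implied_vol(price, S, K, T, r, lo=1e-6, hi=5.0, tol=1e-8):
    """Solve bs_call(sigma) == price for sigma by bisection on [lo, hi]."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) < price:
            lo = mid                   # model price too low: volatility must be higher
        else:
            hi = mid                   # model price too high (or equal): go lower
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

quoted = bs_call(100.0, 100.0, 1.0, 0.05, 0.2)
iv = implied_vol(quoted, 100.0, 100.0, 1.0, 0.05)
```

Newton's method with vega converges faster, but bisection cannot overshoot, which makes it easier to trust in a classroom setting.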

Then, you can discuss why Black-Scholes isn't particularly useful other
than as a first cut... it's really IV which matters because options cost
what they cost...

and you're into the difference between actual volatility and implied
volatility... and you can go into where they're often inversely
correlated, going into an earnings announcement for example. After the
earnings report, actual volatility goes up whereas IV goes down... actual
volatility goes up because of the price jump caused by disappointment or
elation (mostly disappointment lately) whereas IV goes down because
there's less uncertainty... which is really what IV is all about.
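For the "actual" side of that comparison, one common estimate is the annualized standard deviation of log returns. A small sketch, assuming daily closing prices and 252 trading days per year; the price series are made-up numbers for illustration, not market data.

```python
import math

def realized_vol(prices, periods_per_year=252):
    """Annualized sample standard deviation of log returns."""
    rets = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    mean = sum(rets) / len(rets)
    var = sum((x - mean) ** 2 for x in rets) / (len(rets) - 1)
    return math.sqrt(var * periods_per_year)

calm = realized_vol([100.0 * 1.01**i for i in range(10)])    # steady drift, no noise
jumpy = realized_vol([100.0, 110.0, 100.0, 110.0, 100.0])    # large swings
```

A steadily drifting series has essentially zero realized volatility, while a series that jumps back and forth has a large one, even though both can end near where they started.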

Finally, on options pricing... you can even show that for American
options, Black-Scholes doesn't apply because of the value assigned to
the ability to exercise the option on any date up to expiry...
Black-Scholes only applies to European-style options where the exercise
date is fixed.
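One way to show the early-exercise value numerically is a Cox-Ross-Rubinstein binomial tree, pricing the same put with and without the right to exercise early. This is a sketch with illustrative parameters; a put is used because an American call on a non-dividend-paying stock is worth the same as the European one.

```python
import math

def crr_put(S, K, T, r, sigma, steps=200, american=True):
    """Cox-Ross-Rubinstein binomial price of a put option."""
    dt = T / steps
    u = math.exp(sigma * math.sqrt(dt))      # up factor per step
    d = 1.0 / u                              # down factor per step
    disc = math.exp(-r * dt)                 # one-step discount factor
    p = (math.exp(r * dt) - d) / (u - d)     # risk-neutral up probability
    # Payoffs at expiry, indexed by the number of up moves j.
    values = [max(K - S * u**j * d**(steps - j), 0.0) for j in range(steps + 1)]
    # Roll back through the tree one step at a time.
    for i in range(steps - 1, -1, -1):
        for j in range(i + 1):
            cont = disc * (p * values[j + 1] + (1 - p) * values[j])
            if american:
                # The holder may exercise now if that beats waiting.
                cont = max(cont, K - S * u**j * d**(i - j))
            values[j] = cont
    return values[0]

european = crr_put(100.0, 100.0, 1.0, 0.05, 0.2, american=False)
american = crr_put(100.0, 100.0, 1.0, 0.05, 0.2, american=True)
```

The European tree price should land close to the Black-Scholes put value, and the gap between the two results is the value of being allowed to exercise early.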

just thoughts...

-glenn

>
> -- William

Glenn H Tarbox, PhD

Jul 9, 2008, 2:00:08 AM
to sage-f...@googlegroups.com
For the class, it might be entirely unnecessary to use the real-time api
directly. The IB client has the ability to download data into a file
from a live feed. So, you have kinda live and no details to worry about
WRT client-server stuff.

If you think it's necessary to show how one would write an integrated
system, we can come up with something... but it might be distracting
given the details. The class isn't all that long and none of the hard
part has anything to do with math....

but, I'm up for whatever you think is necessary.

-glenn

Chris Swierczewski

Jul 9, 2008, 3:03:34 AM
to sage-f...@googlegroups.com
William, Glenn, and Brett,

> Why don't we write a simple opentick client that *works* from Python,
> so that I can use it in my SIMUW course, then worry about all
> this threading, synchronization, and callback stuff later?

Although I appreciate all the "big picture" tips, I'm currently
working on a simple, straightforward "opentick client that *works*
from Python". Once all that is taken care of (see below), then I'll
begin worrying about large-scale optimization. (e.g. threading,
synchro, etc.) Since this is my first wrapping / implementation job, I
think the best course of action is to build up step by step instead of
taking lots of time to perfect an underlying structure.

> It will be very annoying if everyone spins their wheels in circles
> trying to do something really complicated, with the net result
> being that I have to just sit down and write an opentick interface
> myself so that I have something which actually works for
> my class.

I really hope that won't be the end result.

> Chris or Brett -- who volunteers to get this included in Sage asap?
> Can either of you pull it off in the next two weeks? Glenn -- maybe
> post a link to your code again here with some instructions.

It's unfortunate that I have classes, so I guess I'll have to halve
the time I spend preparing for them. Along with the current 3:30 -
5:30ish time slot I've been taking to work on this situation, I'll now
take 10:30am - 1:00pm as well. Hopefully I can produce some more
results with that time.

> We're taking the route you'd take. I didn't realize that they had a
> full up and full blown synchronized multi-threaded client as part of
> their infrastructure. I was expecting a few parsing classes that got
> nailed together. When I saw what there was, it was clear that wrapping
> OTClient in cython was the obvious first step.

Wrapping OTClient in Cython is task #1. I'm still figuring out how to
get Cython to recognize the library and / or extern from the header
"OTClient.h". This requires learning some Cython---which I'm indeed
doing at the moment. I was just beginning to figure that out today
when Brett and I were sidetracked by a previous Google Finance problem
William mentioned. Methinks it's fixed now. See Ticket #3621. However,
number one priority tomorrow will be to get the Cython-OTClient.h
communication established.

In two weeks I'll be able to dedicate my entire day to working on
these problems. Until then, I'll try to have something working for the
SIMUW class. My apologies for any inconveniences.

--
Chris

William Stein

Jul 9, 2008, 3:18:51 AM
to sage-f...@googlegroups.com
On Wed, Jul 9, 2008 at 12:03 AM, Chris Swierczewski <cswi...@gmail.com> wrote:
>
> William, Glenn, and Brett,
>
>> Why don't we write a simple opentick client that *works* from Python,
>> so that I can use it in my SIMUW course, then worry about all
>> this threading, synchronization, and callback stuff later?
>
> Although I appreciate all the "big picture" tips, I'm currently
> working on a simple, straightforward "opentick client that *works*
> from Python". Once all that is taken care of (see below), then I'll

Make an opentick client that does *anything* at all as a first
step and I'll be pleased if I can use it without tearing my
hair out. :-)

> begin worrying about large-scale optimization. (e.g. threading,
> synchro, etc.) Since this is my first wrapping / implementation job, I
> think the best course of action is to build up step by step instead of
> taking lots of time to perfect an underlying structure.

How about build something step by step, figure out how it worked,
then redo it with a perfect underlying structure later. That's what
I do. It's especially nice when you get somebody else to do the
last step.

>
>> It will be very annoying if everyone spins their wheels in circles
>> trying to do something really complicated, with the net result
>> being that I have to just sit down and write an opentick interface
>> myself so that I have something which actually works for
>> my class.
>
> I really hope that won't be the end result.
>
>> Chris or Brett -- who volunteers to get this included in Sage asap?
>> Can either of you pull it off in the next two weeks? Glenn -- maybe
>> post a link to your code again here with some instructions.
>
> It's unfortunate that I have classes, so I guess I'll have to halve
> the time I spend preparing for them. Along with the current 3:30 -
> 5:30ish time slot I've been taking to work on this situation, I'll now
> take 10:30am - 1:00pm as well. Hopefully I can produce some more
> results with that time.

As much as I would greatly appreciate it, you shouldn't do that.
Please do devote as much time as you need to your classes.
They will be over soon enough and you'll be able to focus
on sage fulltime then. Thanks for your commitment, but please
don't mess up your last quarter of undergrad classes.

>
>> We're taking the route you'd take. I didn't realize that they had a
>> full up and full blown synchronized multi-threaded client as part of
>> their infrastructure. I was expecting a few parsing classes that got
>> nailed together. When I saw what there was, it was clear that wrapping
>> OTClient in cython was the obvious first step.
>
> Wrapping OTClient in Cython is task #1. I'm still figuring out how to
> get Cython to recognize the library and / or extern from the header
> "OTClient.h". This requires learning some Cython---which I'm indeed
> doing at the moment. I was just beginning to figure that out today
> when Brett and I were sidetracked by a previous Google Finance problem
> William mentioned. Methinks it's fixed now. See Ticket #3621. However,
> number one priority tomorrow will be to get the Cython-OTClient.h
> communication established.

Thanks for doing that, by the way. It's very much appreciated!

Tom Boothby may be able to help you with Cython subtleties
by the way.

> In two weeks I'll be able to dedicate my entire day to working on
> these problems. Until then, I'll try to have something working for the
> SIMUW class. My apologies for any inconveniences.

Well, keeping in mind that they are a bunch of 15 and 16 year
olds, we don't need *that* much to keep 'em busy. But some
high-frequency historical data would be very very useful I think.

William

Glenn H Tarbox, PhD

Jul 9, 2008, 3:56:31 AM
to sage-f...@googlegroups.com

as I said previously, we can have high-freq tick data for any number of
instruments tomorrow. There are clients which work "out of the box" for
OT and we can grab historical data.

The effort we were discussing for the OT integration into Sage prior to
the need to support the summer course is entirely about live data...
Historical data retrieval into a file is, essentially, included in the
OT command line tools built with their distro.

Also, I have reams of live tick data for the past few weeks I've been
collecting during the day for various futures (including vix) into
postgres. Live and tick mean different things to IB but for your
purposes it should be more than sufficient.

I think we need to determine what is necessary for the course. IMHO,
live tick data into Sage might be a distraction as Sage doesn't
currently support exogenous events (will soon...).

Personally, I'd like to step back and think about what to present for
the class. I think this is exactly the kind of thing we need as
baseline features in general for the finance module. This kind of broad
overview is crucial for identifying holes and should "just be there".

-glenn

>
> William
