Hi Dave
Welcome! kdb+ is a good place to be! (you should also sign up to the personal kdb+ developers group if you haven't already)
What you are trying to achieve sounds like a pretty common use case, and can be done with TorQ. The best place to start is to download the TorQ base package along with the TorQ Finance Starter Pack. The starter pack is a fairly solid initial stab at a market data capture system. You should then be able to modify the schema and switch in your feed (if you need help converting the output of your feed to something kdb+ understands then give us a shout). We put together a short video on setting up TorQ:
http://www.aquaq.co.uk/q/torq-kdb-data-capture-in-two-minutes/. (we know it's a bit cheesy :-) )
In terms of minimising disk and memory usage there are a few options.
kdb+ will use memory to store data and also to process queries (intermediate result sets). Queries can sometimes be restructured to reduce memory, usually at some cost (e.g. execution time or code complexity). In a standard data capture system there is usually an RDB (real-time database) and an HDB (historic database). The RDB usually holds the current day's data in memory; the HDB usually holds everything prior to today on disk (you can change all this, though). The main user of memory is usually the RDB. TorQ allows you to easily specify (in a config file) which tables (and, if required, which instruments within those tables) are stored in it. Because the RDB in TorQ isn't responsible for writing data to disk at end-of-day, data that is required historically for analysis but is not required intraday can still be captured intraday using the same set-up (one example we have seen is high-volume order data, which quants analyse historically but which isn't used intraday).
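To make the RDB/HDB split concrete, here's a rough sketch of how queries differ between the two (the paths, schema and timespan `time` column are illustrative assumptions, not TorQ defaults):

```q
/ HDB: a date-partitioned database on disk, loaded into a q session;
/ queries filter on the virtual date column (path is illustrative)
\l /data/hdb
select tickcount:count i by date from trade where date within 2015.01.01 2015.01.05

/ RDB: today's data held in memory as plain tables; there is no date
/ column, so you query the table directly (e.g. last 5 minutes of prices)
select last price by sym from trade where time>.z.n-0D00:05
```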
TorQ also has some features for minimising memory usage and tickerplant back-ups intraday. A standard data capture setup has the following processes:
Tickerplant : captures data, writes it to a log, publishes to consumers (the same as tick.q from kdb+ tick)
RDB : stores data required intraday (similar to r.q in kdb+ tick, but with a lot of extensions)
WDB : periodically writes the intraday data to disk (similar to w.q, but with a lot of extensions)
Sort : a separate, optional process which is used to sort or merge the data after end of day. It is invoked by the WDB. We use a separate process to avoid tickerplant back-ups (increased memory usage) in 24-hour markets such as FX
Sort Slave: a separate, optional process which can be used to parallelise the end of day sort/merge process.
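At its core, the periodic write a WDB-style process performs boils down to q's .Q.dpft (the real TorQ WDB does a lot more around this; the path and table name below are illustrative):

```q
/ write the in-memory trade table to today's date partition,
/ enumerating symbols against the HDB's sym file and applying
/ the parted attribute on the sym column
.Q.dpft[`:/data/hdb;.z.d;`sym;`trade]
```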
You can tune the RDB and WDB to minimise memory. We've written some blogs on this.
To minimise disk:
1. Make sure you use compression. The standard TorQ set-up compresses data each day; it also has a separate process that can be used to run through existing databases and compress everything within, driven by a config file (https://github.com/AquaQAnalytics/TorQ/blob/master/config/compressionconfig.csv). This can be used to specify compression settings down to the column level (each column can be compressed differently), and to specify rules such as "only compress data after it is X days old".
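Under the hood this uses kdb+'s built-in file compression, which you can also drive directly (paths below are illustrative):

```q
/ set default compression for all subsequent writes:
/ logical block size 2^17, algorithm 2 (gzip), compression level 6
.z.zd:17 2 6

/ or compress a single existing column file, source to target
-19!(`:/data/hdb/2015.01.01/trade/price;`:/data/compressed/price;17;2;6)
```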
2. Periodically remove or downsample data (some data sets only have value for a certain period, after which an aggregate or downsampled set will suffice)
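For example, old raw ticks could be condensed into 5-minute OHLCV bars with xbar before the raw table is removed (schema and column names are illustrative):

```q
/ condense raw ticks into 5-minute open/high/low/close/volume bars;
/ the raw data for that period can then be deleted
bars:select o:first price,h:max price,l:min price,c:last price,v:sum size
  by sym,bucket:5 xbar time.minute from trade
```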
3. Store less data / make sure your schema is well normalised
Latency vs. throughput: there are a few things that can be tuned. The TP can be run in a batch mode (on a timer), which will increase both throughput and latency. WDB processes can be replicated (e.g. different WDBs for different subsets of data) and the data merged into the same HDB at end-of-day. Don't send table updates to the RDB if they aren't required there. At some point, when the volumes get bigger, you can scale horizontally (i.e. multiple separate data capture set-ups capturing different sets of data)
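With the plain kdb+ tick scripts, for instance, the tickerplant's publish mode is controlled by the timer value on the command line (directory and port are illustrative):

```
q tick.q sym /data/tplogs -p 5010 -t 0     / publish each update immediately (lowest latency)
q tick.q sym /data/tplogs -p 5010 -t 1000  / batch and flush once per second (higher throughput, higher latency)
```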
Tickerplant roll time: we have done TorQ set-ups for FX customers with a different EOD time. It requires a modification to the tickerplant for its end-of-day check. It also requires the users of the data to be comfortable with the concept of "date" in the HDB, i.e. date becomes "trading date" and does not (usually) align with a GMT date or local-time date. We will see if we can incorporate these features into the TorQ tickerplant.
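One way to sketch that change (purely illustrative; TorQ's actual end-of-day check lives in its tickerplant code):

```q
/ a 17:00 roll: treat the trading date as the calendar date of
/ (now + 7 hours), so the "day" flips at 17:00 rather than midnight
tradingday:{"d"$.z.P+0D07:00:00}

/ the tickerplant's end-of-day check then compares its stored day
/ against tradingday[] rather than against .z.D
```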
5/10 min moving average: yes, you can do that. You need to write a process which connects to the TP and subscribes to the data, then calculates the spread values as the ticks arrive.
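A rough sketch of such a subscriber, following the standard tick.q pub/sub pattern (the port, schema and timespan `time` column are assumptions):

```q
h:hopen `::5010          / tickerplant port (assumed)
h(".u.sub";`trade;`)     / subscribe to the trade table, all syms

window:0D00:05           / 5-minute window
cache:()                 / rolling cache of recent ticks

/ called by the tickerplant on each publish; x is a table of new rows
upd:{[t;x]
  if[t=`trade;
    cache::select from (cache,x) where time>.z.n-window;
    show select ma5:avg price by sym from cache];}
```

A real version would publish the averages on to its own subscribers rather than just showing them, and would handle reconnects.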
32-bit vs. 64-bit: TorQ runs on 32-bit (if you happen to be lucky enough to have downloaded one of the commercial versions), though memory limitations need to be considered (again, the RDB is the issue here... everything else can be run to store a lot of data in a small memory footprint). If you need to get a licence and move to 64-bit, the upgrade will be seamless.
HTML5: kdb+ supports WebSockets, so you can write HTML5 screens which talk directly to the database. The monitor process in TorQ is a (very) basic monitoring process which has an example HTML5 front end. We have added some HTML utilities to TorQ, including some code to allow you to do pub/sub for HTML clients -
https://github.com/AquaQAnalytics/TorQ/blob/master/code/common/html.q
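The simplest possible server-side handler is the stock kdb+ WebSocket example (this is not TorQ's pub/sub layer, just the underlying mechanism):

```q
/ .z.ws is invoked with each incoming websocket message; here we
/ evaluate the text as q and send the console-formatted result back
.z.ws:{neg[.z.w] .Q.s value x}
```

On the browser side you'd open a connection with new WebSocket("ws://host:port/") and call ws.send("select from trade") to query it.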
Anything else you need a hand with, please shout!
Thanks
Jonny