updating a kdb table from C (shared lib)


Tiago Rodrigues

May 25, 2015, 9:23:55 AM
to personal...@googlegroups.com
Hi,

  I'm extending some functionality of kdb with a shared library which subscribes to quote feeds from within kdb (instead of using an external feed handler).
  This means that the event processing is being done in real time from inside the shared library, and I'd like to update an existing table in kdb each time a new quote arrives. Is this even possible without callbacks or IPC?

Thanks,

Tiago Rodrigues

Tiago Rodrigues

May 25, 2015, 1:00:36 PM
to personal...@googlegroups.com
Hi,

  just found the answer to my question on kov's post 'Question about callbacks from C-lib to q via C-function sd1()'.
  I can use the following, with handle 0, for in process requests to kdb:

      K result=k(0,"0N!",ki(42),(K)0);


Best regards,


Tiago Rodrigues

Tiago Rodrigues

May 26, 2015, 8:35:45 AM
to personal...@googlegroups.com
Well, it seems I still have a problem after all. I can run queries in q using the k(0, ...), but since I'm calling from a slave thread I only have read-only access to kdb variables.
Any hints on how I can manipulate a kdb variable from inside a thread in the shared lib?

Many thanks,

Tiago Rodrigues

Charles Skelton

May 26, 2015, 8:54:41 AM
to personal...@googlegroups.com
There are a few key points to adhere to:

use k(0,..) from the kdb+ main thread only.
use sd1 to register a callback from the kdb+ main thread.
free objects in the thread in which they were allocated.
do not share objects between threads.
to pass objects between threads, use serialization.
internalize strings (via ss or ks) in the main thread only, or call setm(1) to allow internalization from other threads.
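
A minimal sketch of how these points can fit together, assuming a POSIX pipe as the channel between threads. The feed thread writes a fixed-size struct of plain bytes (a hand-rolled stand-in for serialization, so no K object is shared across threads), and the main thread builds the K objects inside a callback registered with sd1. The names quote_msg, quote_publish, quote_take, quote_init and the q function upd are hypothetical. The kdb+ glue is compiled only when KXVER is defined and k.h is on the include path, and it has not been checked against any particular kdb+ release.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Fixed-size message handed from the feed thread to the main thread.
   Plain bytes only: no K objects cross the thread boundary. */
typedef struct { char sym[16]; double price; } quote_msg;  /* hypothetical layout */

static int quote_fd[2] = { -1, -1 };  /* [0] read end (main), [1] write end (feed) */

/* Main thread, once, before the feed thread starts. */
int quote_channel_open(void) { return pipe(quote_fd); }

/* Feed thread: copy the quote into a message and write it in one call. */
int quote_publish(const char *sym, double price) {
    quote_msg m;
    size_t n;
    assert(sym != NULL);
    memset(&m, 0, sizeof m);
    n = strlen(sym);
    if (n >= sizeof m.sym) n = sizeof m.sym - 1;   /* truncate, keep the NUL */
    memcpy(m.sym, sym, n);
    m.price = price;
    return write(quote_fd[1], &m, sizeof m) == (ssize_t)sizeof m ? 0 : -1;
}

/* Main thread: read exactly one message; returns 0 on success. */
int quote_take(quote_msg *out) {
    return read(quote_fd[0], out, sizeof *out) == (ssize_t)sizeof *out ? 0 : -1;
}

#ifdef KXVER  /* kdb+ glue: build with e.g. -DKXVER=3 and k.h available */
#include "k.h"

/* Runs on the kdb+ main thread when quote_fd[0] becomes readable. */
static K on_quote(I fd) {
    quote_msg m;
    (void)fd;
    if (quote_take(&m) == 0) {
        /* ks() interns on the main thread; "upd" is a hypothetical q function. */
        K r = k(0, "upd", ks(m.sym), kf(m.price), (K)0);
        if (r) r0(r);   /* release the result in the thread that received it */
    }
    return (K)0;
}

/* Called once from q on the main thread after loading the library. */
K quote_init(K x) {
    (void)x;
    if (quote_channel_open() != 0) return krr("pipe");
    sd1(quote_fd[0], on_quote);
    /* start the feed thread here; it only ever calls quote_publish() */
    return (K)0;
}
#endif
```

With this layout the feed thread never touches the kdb+ API at all, which is the simplest way to stay inside every rule in the list; setm(1) would only be needed if the feed thread had to call ss or ks itself.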

Some may choose to ignore the above claiming that it seems to work ok anyway ;-) But they may be setting themselves up for tracking down nasty race conditions with random crashes sometime later.

Hope this helps,
Charlie

Felix Lungu

May 26, 2015, 3:54:28 PM
to personal...@googlegroups.com

On 2015.05.26, at 15:54, Charles Skelton <cha...@kx.com> wrote:

> or call setm(1) ???

what is this function doing?

Charles Skelton

May 26, 2015, 4:09:49 PM
to personal...@googlegroups.com
setm, declared as I setm(I f), is part of the sharedlib c-api; it can be called to activate locks around the interning of symbols in threads other than the kdb+ main thread. It should only be called when you can guarantee that no threads could be inside sn at the time of the call, e.g. call it from the kdb+ main thread on startup. It returns the previous setting.


--
You received this message because you are subscribed to the Google Groups "Kdb+ Personal Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to personal-kdbpl...@googlegroups.com.
To post to this group, send email to personal...@googlegroups.com.
Visit this group at http://groups.google.com/group/personal-kdbplus.
For more options, visit https://groups.google.com/d/optout.

Tiago Rodrigues

May 27, 2015, 8:18:20 AM
to personal...@googlegroups.com
Thanks, the callback works nicely. I was making things harder than they needed to be.

Tiago

Jay Han

May 28, 2015, 2:58:59 AM
to personal...@googlegroups.com
Charles:
- if k(0,...) and q function it calls do not write, can't they be run safely on non-main threads?
- does m9() inside thread take care of reference counts?



Jay Han

May 28, 2015, 3:01:34 AM
to personal...@googlegroups.com
what is the meaning of the argument for setm(I f)?

...
setm(1);
ks("asymbolinternedforever");
setm(0); //???
...

Charles Skelton

May 28, 2015, 3:14:09 AM
to personal...@googlegroups.com
>- if k(0,...) and q function it calls do not write, can't they be run safely on non-main threads?

If the q function does not read any global variables, it may be ok.

>- does m9() inside thread take care of reference counts?

No, it just frees whatever was allocated for that thread.


Charles Skelton

May 28, 2015, 3:14:28 AM
to personal...@googlegroups.com
> what is the meaning of the argument for setm(I f)?

The argument f is to activate locks around sym interning. If you need to intern from threads other than main, you should set this just once, from the main thread, and never reset it.

Charles Skelton

May 28, 2015, 3:21:34 AM
to personal...@googlegroups.com
>>- if k(0,...) and q function it calls do not write, can't they be run safely on non-main threads?
>If the q function does not read any global variables, it may be ok.

Actually, not even then. There is still potential for ref count anomalies in the vm, which could lead to a crash after some time.

kdb+ itself handles these issues differently during peach/multithreaded input mode, but this is not available to the capi directly.

Tiago Rodrigues

May 28, 2015, 3:23:23 AM
to personal...@googlegroups.com
I was wondering if there is a reference manual for the sharedlib API (beyond what's on the integrating/extending with C pages on the wiki). That would be really helpful to better understand the internals and avoid some head-banging on the wall :)


Charles Skelton

May 28, 2015, 3:29:33 AM
to personal...@googlegroups.com
The internals of kdb+ are really quite complicated and subject to change; we try to hide that with the simple c-api.

Jay Han

May 28, 2015, 4:21:16 AM
to personal...@googlegroups.com
when I read setm() above I thought thread-safety for ks()/ss().

if read-only access from non-main thread is not safe, the little descriptions for setm() and m9() are confusing... documentation does *not* say something like

you can launch threads with callbacks to main.
call setm() in the main once (and only once) before any threads use ks()/ss().
call m9() inside threads before exiting.

ok I am probably misreading mails and docs and second guessing too much!

Jay Han

May 28, 2015, 4:30:13 AM
to personal...@googlegroups.com
another vote for reference doc. (lately i've been learning about a few new functions not appearing in wiki pages.)

something I've been wondering about: c api doesn't seem to include dictionary lookup support.

Tiago Rodrigues

May 28, 2015, 6:06:33 AM
to personal...@googlegroups.com
Hi Charles,

  I really appreciate your help so far and hope I'm not abusing your goodwill, but I still have some questions.

  For instance, this works on the main thread, but results in a kdb crash when in a thread:

char *CALLBACK = NULL;

K callback() {
  K a=kp("C");
  k(0,CALLBACK,a,(K)0);
  R (K)0; }

Z K1(test) {
  CALLBACK=x->s;
  sd1(0,callback);
  R (K)0; }


The callback function in Q:

upd:{[x] update state:x from `.mbt.c}


Table definition:

c:([name:0#`] s:0#0N; state:0#`; server:0#`)



Also, there seems to be a 1-2 second delay on the update from the callback. Does it have some kind of deferred execution?

Here is an example:


q).mbt.c
name| s state server
----| --------------
qt  | 6 N
q).mbt.test[`upd]
q).mbt.c       <------  About 1 second after .mbt.test call
name| s state server
----| --------------
qt  | 6 N
q).mbt.c       <------  About 3 seconds after .mbt.test call
name| s state server
----| --------------
qt  | 6 C


This also means that if I call r0(a) after the call to k(0,...) it causes a sigsegv (I assume it's freeing 'a' before finishing the update?)


q).mbt.c
name| s state server
----| --------------
qt  | 6 N
q).mbt.test[`upd]
q).mbt.c
name| s state server
----| --------------
qt  | 6 N
q).mbt.c
name| s state server
----| --------------
qt  | 6 .
q)
rlwrap: warning: q crashed, killed by SIGSEGV.


Am I completely misunderstanding the correct way to do this?

Tiago

Charles Skelton

May 28, 2015, 7:09:44 AM
to personal...@googlegroups.com
The first arg to sd1 should be a file descriptor that the main kdb+ loop can monitor for readable activity via the system select call. For example, see the section using eventfds at http://code.kx.com/wiki/Cookbook/InterfacingWithC
I don't see the sense in using stdin (fd 0) for the sd1 call, sd1(0,...); I think that can only confuse kdb+ since it is already monitoring stdin.

For a system which is not under load, the latency should just be how long the select call takes to return, so a few µs. The callback is not deferred.
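
As a rough illustration of a descriptor that is suitable for sd1, here is a sketch assuming Linux and eventfd(2). The worker thread bumps a counter to wake the main loop, and the callback drains it on the main thread. The names wake_open, wake_signal, wake_drain and wake_init are hypothetical, and the kdb+ glue is compiled only when KXVER is defined and k.h is available; it is not taken from the cookbook page mentioned above.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

static int wake_fd = -1;  /* eventfd shared by the worker and the main thread */

/* Main thread, once: create a non-blocking eventfd with its counter at 0. */
int wake_open(void) {
    wake_fd = eventfd(0, EFD_NONBLOCK);
    return wake_fd < 0 ? -1 : 0;
}

/* Worker thread: add 1 to the counter, which makes the fd readable. */
int wake_signal(void) {
    uint64_t one = 1;
    assert(wake_fd >= 0);
    return write(wake_fd, &one, sizeof one) == (ssize_t)sizeof one ? 0 : -1;
}

/* Main thread: read and reset the counter; returns 0 if nothing is pending. */
uint64_t wake_drain(void) {
    uint64_t n = 0;
    if (read(wake_fd, &n, sizeof n) != (ssize_t)sizeof n) return 0;
    return n;
}

#ifdef KXVER  /* kdb+ glue: build with e.g. -DKXVER=3 and k.h available */
#include "k.h"

/* Runs on the kdb+ main thread whenever wake_fd is readable. */
static K on_wake(I fd) {
    uint64_t pending = wake_drain();
    (void)fd;
    (void)pending;  /* handle `pending` queued events here */
    return (K)0;
}

K wake_init(K x) {
    (void)x;
    if (wake_open() != 0) return krr("eventfd");
    sd1(wake_fd, on_wake);  /* a dedicated fd, not 0: stdin is already monitored */
    return (K)0;
}
#endif
```

Because the eventfd only carries a count, the events themselves would have to travel through some other thread-safe queue; the descriptor is just the wake-up signal.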

For these kinds of issues, it is really helpful to have a snippet of code that will fully compile and yet reproduces the issue.

Also please include (most of) the kdb+ startup banner as that has relevant details for the OS, kdb+ version/release etc.

andrew...@aquaq.co.uk

May 28, 2015, 11:44:59 AM
to personal...@googlegroups.com
Thanks Charlie

We've updated that document recently and added some source code: https://github.com/AquaQAnalytics/kdb-c-interface

We've been building some feed handlers in C for clients recently and created a feed handler tutorial document which is probably also relevant. It walks through how to build an example feed handler in c as both a shared object loaded into the q process and a standalone executable (https://github.com/AquaQAnalytics/kdb-feedhandler-tutorial)

Tiago Rodrigues

May 29, 2015, 1:39:04 PM
to personal...@googlegroups.com
Charlie, you are right, using the stdin was causing the delay (it waited for some input on q console before triggering the callback) which as you said didn't make sense. The example feed handler Andrew linked to was quite helpful in understanding the way to follow and I've managed to get it working.

Andrew, if you don't mind my asking, is there any special reason for your choice of using sockets instead of a pipe to communicate with the main process, or is it just a matter of personal preference?

mark....@aquaq.co.uk

May 29, 2015, 4:07:44 PM
to personal...@googlegroups.com
Hi,

It's probably optimal in terms of performance to use pipes on a Linux machine, but we used sockets instead of pipes for a few reasons:

  - Trying to keep the code simple: One of the goals was that the code should compile and run on both Linux and Windows, so we made use of the selectable socketpair available here: https://github.com/ncm/selectable-socketpair.
This made it easier to abstract over the different operating systems, and it is reliable, well-tested code. Keeping the code simple also allows people to focus on the interaction with the kdb+ api to see how it works rather than wading through
platform-specific implementation details or having to bother with the differences between sockets and pipes (e.g. send/recv vs read/write calls).

  - Sockets are not that much slower than pipes on Linux: If you look at the implementation of the socket pair in the library above, you will see that it uses the AF_UNIX/AF_LOCAL socket family on Linux.
This is intended for communicating between processes/threads on the same machine and cuts down on a fair bit of the overhead you would get when using AF_INET sockets. On Windows I believe that the
performance gap between named pipes and sockets is bigger, but you can't just blindly replace the socket with a pipe because of the issue below.

  - It's tricky to get pipes to work nicely with some functions on Windows: the implementations of pipes on Linux and Windows are very different, and in particular trying to use the select() function on a pipe fails on Windows.
This can make the sd1() callback hang if passed a file descriptor that belongs to a pipe. I think that the Windows implementation of select() will only work on the handles generated via WinSock (but I'm not 100%
sure on this).
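
To make the select() point concrete, here is a small POSIX-only sketch, not taken from the tutorial code: it creates a local AF_UNIX stream socket pair and polls one end with select(), which is the same readiness check the kdb+ main loop is described as performing on an sd1 descriptor. The helper names local_pair and readable_within are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Create a connected pair of local stream sockets; returns 0 on success. */
int local_pair(int sv[2]) {
    return socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
}

/* Returns 1 if fd becomes readable within timeout_ms, 0 on timeout, -1 on error. */
int readable_within(int fd, int timeout_ms) {
    fd_set rd;
    struct timeval tv;
    int n;
    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rd);
    FD_SET(fd, &rd);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    n = select(fd + 1, &rd, NULL, NULL, &tv);
    if (n < 0) return -1;
    return (n > 0 && FD_ISSET(fd, &rd)) ? 1 : 0;
}
```

On Linux or OSX the same helper would also work on the read end of a pipe; the portability concern above is specific to Windows.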

I hope this answers your question!

Thanks

Tiago Rodrigues

May 30, 2015, 3:30:29 AM
to personal...@googlegroups.com
Hi Mark,

  I expect a pipe to be about 10% faster than a UNIX socket on OSX and Linux. This shouldn't be much of a problem since other parts of the code will probably be on the critical path well before the performance delta becomes an issue. Also, your note on Windows select() failing on a pipe justifies the option for sockets on portability alone.

  Just a note. If you'll be having multiple threads triggering the same callback function, a pipe will guarantee that the writes are atomic (if no larger than PIPE_BUF). The selectable-socketpair is using a STREAM socket, which is not guaranteed to be atomic and I assume could be problematic in this scenario. One could probably change it to use a DATAGRAM socket to get over this (though I haven't tested). You probably knew this already, but it might be useful for someone using the feedhandler tutorial as a template. :)
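
A minimal sketch of the datagram idea, assuming a POSIX system and a plain socketpair() rather than the selectable-socketpair library: with SOCK_DGRAM each send() is delivered as one whole message, so writes from several threads cannot interleave inside a message. The helper names dgram_pair, msg_send and msg_recv are hypothetical, and this has not been tried inside the feedhandler tutorial.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Create a connected pair of local datagram sockets; returns 0 on success. */
int dgram_pair(int sv[2]) {
    return socketpair(AF_UNIX, SOCK_DGRAM, 0, sv);
}

/* Any thread: one send() is one message; returns 0 if it was sent whole. */
int msg_send(int fd, const void *buf, size_t len) {
    assert(buf != NULL);
    return send(fd, buf, len, 0) == (ssize_t)len ? 0 : -1;
}

/* Main thread: receive exactly one message; returns its length or -1. */
ssize_t msg_recv(int fd, void *buf, size_t cap) {
    return recv(fd, buf, cap, 0);
}
```

Unlike a stream socket, two messages sent back to back come out as two separate recv() results, so the callback never has to reassemble or split records.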

  Again, thank you all for your answers. They have been most helpful.

Tiago