Along a related line, you could think about adding a template mechanism that would enable users to create dictionary encodings for each table column, as an alternative to symbol type columns
with the standard enumeration. This approach would allow users to encode strings as char, short or int types, on a per column basis.
The main benefit is a reduced in-memory and on-disk table size, faster writes and improved query speed.
Memory is reduced because you could switch out long ints (symbol type) for potentially char, short or ints. Writers are faster because you could potentially remove .Q.en if you
have no symbols. Lookups using char or short are also faster I find.
Currently, in the latest TorQ, when you query the intraday DB, you already provide a mechanism (maptopoint) for mapping symbol to their corresponding long.
neg[gwHandle](`.gw.asyncexec;"select from trade where int=maptoint[`GOOG]";`idb);gwHandle[]
The idea would be to provide a general mechanism/framework so that users can define their encodings for a range of columns.
"select from ExecutionReport where symbol = SymbolMap[`GOOG], order_type = OrderTypeMap[`Limit], side = SideMap[`BUY]"
While this approach does make the query syntax a bit more verbose, it can lead to substantial memory savings.
Additionally, feed handlers publishing data to kdb+ often prefer to avoid strings and work exclusively with integers, making it essential to have a place to store the word associated with each integer.