Beta-12 introduces range query support for Dictionaries. To upgrade the addon:
load 'pacman'
'update' jpkg ''
'upgrade' jpkg 'data/dictionary'
Documentation now includes an implementation of the well-known shortest paths Dijkstra's algorithm using a tree dictionary.
Marcin
Hi Pascal,
If you plan to use multiple dictionaries that need to be indexed as elements of an array, at least two approaches are possible:
1. Array of dictionaries.
load 'dictionary'
]mydicts =: {{'hash' conew 'jdictionary'}}"0@:i. 7 NB. Create array of 7 dictionaries.
┌─┬─┬─┬─┬─┬─┬─┐
│0│1│2│3│4│5│6│
└─┴─┴─┴─┴─┴─┴─┘
]d =: 3 { mydicts NB. Select one of dictionaries.
┌─┐
│3│
└─┘
(put__d~ 100&+) i. 3
get__d i. 3
100 101 102
2. Use one dictionary instead of an array, encoding the array index directly in the key.
]mydict =: 'jdictionary' conew~ 'hash' ,&< 'keyshape' ; 2 NB. Keys are (index, true key).
┌─┐
│7│
└─┘
(100&+ put__mydict 3&,.) i. 3
get__mydict 3 ,. i. 3
100 101 102
While the first approach may be limiting when you want to put keys and values to multiple dictionaries within a single J sentence, the second approach does not seem to have this limitation, because array of dictionaries is simulated by one dictionary with expanded keys. Also, the second approach may be beneficial for range queries, because everything is stored in a single dictionary. For range queries, create the dictionary as 'tree' instead of 'hash'.
Is any of these approaches suitable for your planned use case of dictionary?
Marcin
> What I think you are asking for is a more gentlemanly dictionary, where
updates are applied one at a time and you can look at the keys or values
at your leisure. Maybe I can think of it as an offline version of a
general dictionary, one in which updates from other threads are not
possible.
I'll start here, but rest is more deferential. No. while small dictionaries are useful, I believe in bulk operations, and espeically helper modifiers to abstract complex tasks. I believe functional approach to dictionaries makes these functions easier. I'll repost https://github.com/Pascal-J/kv as my dictionary implementation. It is meant for high performance, in its own way, but also convenient syntax. My current project is to create a generic search/solver function generalized from the complexity of a chess search engine. Dictionaries are a useful means of organizing all of the board state data, and very large data structures can occur in expanding and evaluating all legal moves as nested dictionaries of same fundamental format. threading is possible as long as each thread explores a single move, and parent thread takes care of updating its key/value for that move/sub dictionary. A generalized search/solver function is easier with dictionaries than a flat table where legal moves can be 0 1 or many, and expanding the table must guard against each of these edge cases. Recursive descent on a dictionary structure is an easy path to generalize to all problems... I hope.
> the symbols implementation was clumsy and restricted and that we would remove it when we had a good replacement.
If I had a dictionary with 100 or 1000+ keys, I'd definitely look at using your implementation.I don't actually know if the overall overhead of using a dictionary with 10-40 keys with symbols (symbol creation overhead vs get savings) is better performance overall than boxed strings. Instead of high key volume, my kv implementation, has functions to enhance inverted tables, where a key is a field name. It seems very approachable to build a RDBMS system this way with tables as higher dictionary and join information part of individual table dictionaries. But then symbols acting as indexed/hash search in one column to get "record" seems useful, and one key per column makes querying easy to read.
while kv is meant to have any keys/values, practically, general dictionaries have symbol keys, and boxed (mixed) values in kv, through autopromotion. The kvi function is meant to "freeze" a dictionary for retrieval with &i:,and that speedup could still work with boxed strings instead of symbols. While I haven't tested the speedup for symbol lookup on fairly small kvs/columns, but don't notice an expense in creating symbols, using boxed strings is unlikely on small kvs to make them not worthwhile. The key optimization in kv is append based updates/deletes where null code is equivalent to a deleted key. Ordering and permitted duplicates is a nice feature.
One implementation of RDBMS type tables I believe you want to optimize for is keys fieldnames, fields, unique indexes, non unique indexes, as boxed shape 1 values. where indexes are sub dictionaries with row key, position value to use when retrieving from fields (inverted table oriented). In kv, I would just have used symbols for anything that wants an index as automagical simplicity. Perhaps in your concurrent access focus, you would want row oriented tables? with only one key/index? I worry that other applications than your optimized focus will be degraded by removing symbols. I'm guessing at workarounds needed.
> I have just now fixed the implementation so that it does not expose any noun that would allow a user to crash the system. You must not use the 16!:n functions that have negative n.
That could be a regression. get =: dict 16!:_2 is easily transformable to get =: 4: : 'y 16!:_2 x'"1 0 . It can be good news if it is impossible to crash J by any access to dict. It;s bad news if you figured out a way somehow to hide dict inside of object. It's easy to avoid 1 {:: dict in wrapper functions. It is harder to never forget (1) 16!:_5 before a temporary dict goes out of scope, and to not have documented causes of crashes. The latter is the only complexity. Whether or not you change the internals of get/put, they will probably keep their 16!:_ signatures, and if they don't I can update wrappers. Whether or not documented, they are exemplified in dictionary class/file. < dict not crashing (it doesn't) is for embedding a dictionary as a keyed value. My only key requests is that < dict continues to work, and (1) 16!:_5 before destruction is not needed, while assuming that a temporary dict created in a function gets erased when function ends.