515 views

Skip to first unread message

Feb 18, 2012, 6:28:51 AM2/18/12

to

I have an application that resembles dictionary lookup, with significant

rates of collision.

The current approach is to initialize the hash table, a 10^4 element

list of empty sublists, and then build the table by inserting document

object id numbers into the correct sublists as indexed by some important

property of the object. The choice of property varies with application.

The property may be quantized geographic location, or some attribute of

the object itself.

When done building the hash table, query consists of reading the sublist

using an index derived in the same way as for the original building of

the list. Each sublist may have many entries, and the entries need not

be numbers.

Anyway, I have a workable approach using

HashList[[index]]=

Union[Flatten[Join[HashList[[index]],{new object ID}]]]]

as the core update function, but I'm wondering how well this will scale

to say 10^6 sublists (which may be needed to keep the collisions under

control with large datasets), and of course if there is a better way in

Mathematica.

I did look at SparseArrays, but they don't support elements that are

sublists.

Although this hash table is one dimensional, I have uses for 2D and

maybe 3D tables, which are very common when one is hashing physical

location.

Thanks in advance,

Joe Gwinn

rates of collision.

The current approach is to initialize the hash table, a 10^4 element

list of empty sublists, and then build the table by inserting document

object id numbers into the correct sublists as indexed by some important

property of the object. The choice of property varies with application.

The property may be quantized geographic location, or some attribute of

the object itself.

When done building the hash table, query consists of reading the sublist

using an index derived in the same way as for the original building of

the list. Each sublist may have many entries, and the entries need not

be numbers.

Anyway, I have a workable approach using

HashList[[index]]=

Union[Flatten[Join[HashList[[index]],{new object ID}]]]]

as the core update function, but I'm wondering how well this will scale

to say 10^6 sublists (which may be needed to keep the collisions under

control with large datasets), and of course if there is a better way in

Mathematica.

I did look at SparseArrays, but they don't support elements that are

sublists.

Although this hash table is one dimensional, I have uses for 2D and

maybe 3D tables, which are very common when one is hashing physical

location.

Thanks in advance,

Joe Gwinn

Feb 19, 2012, 6:29:47 AM2/19/12

to

considered using the DownValues of a symbol?

dict[_] = Null

dict["key1"] = "value1"

dict["key2"] = "value2"

This is a mutable data structure which might not play very well with

Mathematica's native paradigms, but it is often very useful.

You can also consider using a list of rules:

rules = {"key1" -> "value1", "key2" -> "value2"}

You can build a dispatch table using Dispatch[rules] to make

replacements very fast, but every time you add an element, it will be

necessary to re-build the Dispatch object from the rules, which might

take a while.

--

Szabolcs Horvát

Visit Mathematica.SE: http://mathematica.stackexchange.com/

Feb 19, 2012, 6:36:27 AM2/19/12

to

would try the built in mechanisms of Mathematica. For example, here is a

trivial loop which creates a 'hash table' called hashtable, with values

f[key]:

kk=1;

While[kk < 100000,

hashtable[kk]=f[kk];

kk++;

];

Of course, in this example the keys to the table are just the integers,

but the built-in mechanism will work with keys of any structure.

When a function is defined in this way as lots of individual values,

Mathematica uses techniques (probably hash) to access the values. Try it

and see how efficient it is!

David Bailey

http://www.dbaileyconsultancy.co.uk

Feb 19, 2012, 6:36:58 AM2/19/12

to

On 2/18/2012 3:28 AM, Joseph Gwinn wrote:

> I have an application that resembles dictionary lookup, with significant

> rates of collision.

>

>...
> I have an application that resembles dictionary lookup, with significant

> rates of collision.

>

Mathematica presumably already uses hashing for storage of stuff like

f[{a,b,c}] = value1

f[{1,2,c}] = value2

etc.

Initializing a hashtable with 10^4 (identical?) empty keys makes no

sense. Did I misunderstand your message?

Feb 20, 2012, 2:48:47 AM2/20/12

to

In article <jhqmrr$fbq$1...@smc.vnet.net>,

The table has to be updated more or less at random as the dataset is

digested. What I'm doing is essentially your above method, but not all

elements will ever receive a value. In fact, most will not.

The reason to initialize the hashtable in advance is so a later probe to

an empty table element yields {}, versus repeating the question back.

> When a function is defined in this way as lots of individual values,

> Mathematica uses techniques (probably hash) to access the values. Try it

> and see how efficient it is!

Mathematica does uses hash tables internally, to good effect. There is

a discussion of this with respect to internal implementation of sparse

arrays, but I assume that hash tables are far more widely used inside

Mathematica.

For the record, here is my core hashtable algorithm, with transient

details omitted:

HashIndexModulus = 10000;

HashTable=Table[{},{HashIndexModulus}];

HashValuesList = Map[HashingFunction, dataset];

Do[

Do[HashTable[[i]] = Join[HashTable[[i]], {j}] // Union, {i,

HashValuesList[[j]]}

],

{j, Length[HashValuesList]}

];

The dataset in the above case is a list containing thousands of strings,

the strings being on average 100 characters long.

The code takes about 3 seconds on a 3.2 GHz intel iMac, processing about

1200 strings, and the computation time does not much vary with the size

of the hashtable. I would expect it to instead vary directly with the

total number of characters in the dataset.

Above 10^4 elements, creating the empty hashtable is linear in the

length of the hashtable, costing on average 53 to 54 nanoseconds per

element. Below 10^4, elements, one can see evidence of internal

gearshifting within the kernel. In any event, it takes 55 milliseconds

to init a 10^6 element hashtable, so this isn't a problem.

Joe Gwinn

digested. What I'm doing is essentially your above method, but not all

elements will ever receive a value. In fact, most will not.

The reason to initialize the hashtable in advance is so a later probe to

an empty table element yields {}, versus repeating the question back.

> When a function is defined in this way as lots of individual values,

> Mathematica uses techniques (probably hash) to access the values. Try it

> and see how efficient it is!

a discussion of this with respect to internal implementation of sparse

arrays, but I assume that hash tables are far more widely used inside

Mathematica.

For the record, here is my core hashtable algorithm, with transient

details omitted:

HashIndexModulus = 10000;

HashTable=Table[{},{HashIndexModulus}];

HashValuesList = Map[HashingFunction, dataset];

Do[

Do[HashTable[[i]] = Join[HashTable[[i]], {j}] // Union, {i,

HashValuesList[[j]]}

],

{j, Length[HashValuesList]}

];

The dataset in the above case is a list containing thousands of strings,

the strings being on average 100 characters long.

The code takes about 3 seconds on a 3.2 GHz intel iMac, processing about

1200 strings, and the computation time does not much vary with the size

of the hashtable. I would expect it to instead vary directly with the

total number of characters in the dataset.

Above 10^4 elements, creating the empty hashtable is linear in the

length of the hashtable, costing on average 53 to 54 nanoseconds per

element. Below 10^4, elements, one can see evidence of internal

gearshifting within the kernel. In any event, it takes 55 milliseconds

to init a 10^6 element hashtable, so this isn't a problem.

Joe Gwinn

Feb 20, 2012, 2:49:48 AM2/20/12

to

In article <jhqmsq$fbt$1...@smc.vnet.net>,

No, you understood correctly. Please see my response to David Bailey

for the rationale.

Joe Gwinn

for the rationale.

Joe Gwinn

Feb 20, 2012, 2:50:49 AM2/20/12

to

In article <jhqmfb$f7i$1...@smc.vnet.net>,

Each key value would need to support a perhaps empty perhaps large list

of values.

> You can build a dispatch table using Dispatch[rules] to make

> replacements very fast, but every time you add an element, it will be

> necessary to re-build the Dispatch object from the rules, which might

> take a while.

I did consider rules and Dispatch, but they seemed mismatched and thus

overly complex (and slow) compared to the direct approach with lists.

Please see my response to David Bailey for the details of my approach.

Joe Gwinn

of values.

> You can build a dispatch table using Dispatch[rules] to make

> replacements very fast, but every time you add an element, it will be

> necessary to re-build the Dispatch object from the rules, which might

> take a while.

overly complex (and slow) compared to the direct approach with lists.

Please see my response to David Bailey for the details of my approach.

Joe Gwinn

Feb 21, 2012, 6:11:40 AM2/21/12

to

thing by typing

f[x_]:= {}

if {} is the default empty value.

RJF

Feb 21, 2012, 6:20:22 AM2/21/12

to

Hi,

> The table has to be updated more or less at random as the dataset is

> digested. What I'm doing is essentially your above method, but not all

> elements will ever receive a value. In fact, most will not.

>

> The reason to initialize the hashtable in advance is so a later probe to

> an empty table element yields {}, versus repeating the question back.

Honestly, despite your additional explanations I don't feel I fully

understand what you are trying to do, but the following assumes that it

is essential that you use your own hash function instead of what

Mathematica would internally use.

You should note that David's suggestion to use DownValues to store the

key/value pairs will allow for returning empty lists for the keys that

have not been met yet, even without preallocating entries or a priori

knowledge of the hash index modulus. You would simply "initialize" with

a blank pattern (see second line):

ClearAll[hashtable];

hashtable[_] = {};

Module[{index},

Scan[(

index = hashingfunction[#];

hashtable[index] = Union[Join[hashtable[index], {#}]]

) &,

dataset

];

]

It is difficult to make a statement about runtime performance compared

to your approach, since it will depend on the hash function you are

using. But I see some advantages beside runtime performance: the

approach will work without knowledge of the hash index modulus and it

will be more memory efficient if the modulus is very large and thus

there are many empty values (in general I think that a large modulus is

what you usually are after when looking for a good hash function).

You might want to try the above with your hash function, with

hashingfunction = Hash;

dataset = DictionaryLookup["*t*"];

the above will take 0.343 Seconds for about 41000 strings on my machine,

which is presumably much slower than your machine. Using StringLength as

hashfunction slows things down considerable, most probably because of

the need to reallocate memory for the growing value arrays. I think that

this is also dominating the runtime of your approach, and because

everything is in one large array even more so. For my example with

hashfunction=StringLength building the hash table using DownValues seems

to be a lot faster...

hth,

albert

> The table has to be updated more or less at random as the dataset is

> digested. What I'm doing is essentially your above method, but not all

> elements will ever receive a value. In fact, most will not.

>

> The reason to initialize the hashtable in advance is so a later probe to

> an empty table element yields {}, versus repeating the question back.

understand what you are trying to do, but the following assumes that it

is essential that you use your own hash function instead of what

Mathematica would internally use.

You should note that David's suggestion to use DownValues to store the

key/value pairs will allow for returning empty lists for the keys that

have not been met yet, even without preallocating entries or a priori

knowledge of the hash index modulus. You would simply "initialize" with

a blank pattern (see second line):

ClearAll[hashtable];

hashtable[_] = {};

Module[{index},

Scan[(

index = hashingfunction[#];

hashtable[index] = Union[Join[hashtable[index], {#}]]

) &,

dataset

];

]

It is difficult to make a statement about runtime performance compared

to your approach, since it will depend on the hash function you are

using. But I see some advantages beside runtime performance: the

approach will work without knowledge of the hash index modulus and it

will be more memory efficient if the modulus is very large and thus

there are many empty values (in general I think that a large modulus is

what you usually are after when looking for a good hash function).

You might want to try the above with your hash function, with

hashingfunction = Hash;

dataset = DictionaryLookup["*t*"];

the above will take 0.343 Seconds for about 41000 strings on my machine,

which is presumably much slower than your machine. Using StringLength as

hashfunction slows things down considerable, most probably because of

the need to reallocate memory for the growing value arrays. I think that

this is also dominating the runtime of your approach, and because

everything is in one large array even more so. For my example with

hashfunction=StringLength building the hash table using DownValues seems

to be a lot faster...

hth,

albert

Feb 21, 2012, 6:21:54 AM2/21/12

to

On 2/20/2012 9:48 AM, Joseph Gwinn wrote:

>

> The reason to initialize the hashtable in advance is so a later probe to

> an empty table element yields {}, versus repeating the question back.

>

>

I probably don't understand your question completely, but why can't you
>

> The reason to initialize the hashtable in advance is so a later probe to

> an empty table element yields {}, versus repeating the question back.

>

>

just make definition like

hashtable[_] = {}

? This will ensure that {} is returned for "undefined" keys.

Also, please avoid using symbols that start with a capital letter.

HashTable is a built-in symbol and if you define it (as you did in the

example code you sent), it is likely to break stuff.

Feb 22, 2012, 5:34:48 AM2/22/12

to

Hi Joe,

I suspect that most of the time is spent it adding to sub-lists, rather

than hashing proper. I suggest to use a DownValues-based hash-table with

linked lists, as follows.

Here is a linked list "API":

ClearAll[linkedList, toLinkedList, fromLinkedList, addToList, pop,

emptyList];

SetAttributes[linkedList, HoldAllComplete];

toLinkedList[data_List] := Fold[linkedList, linkedList[], data];

fromLinkedList[ll_linkedList] :=

List @@ Flatten[ll, Infinity, linkedList];

addToList[ll_, value_] := linkedList[ll, value];

pop[ll_] := Last@ll;

emptyList[] := linkedList[];

Here is a simplistic hash table constructor (I exploit the fact that you

don't have to delete

key-value pairs):

ClearAll[makeHash];

makeHash[] :=

Module[{hash},

hash[_] := emptyList[];

hash /: add[hash, key_, value_] :=

hash[key] = addToList[hash[key], value];

hash /: get[hash, key_] := fromLinkedList@hash[key];

Return[hash]];

Let's create a new hash table:

In[10]:= hash = makeHash[]

Out[10]= hash$777

Now, I will populate it with the words of built-in English dictionary, the

keys being first two letters of a word:

values = DictionaryLookup["*"];

keys = StringTake[#, Min[2, StringLength[#]]] & /@ values;

It takes only half a second to fill the hash table, on my machine:

In[14]:= (add[hash,##]&@@@Transpose[{keys,values}]);//Timing

Out[14]= {0.515,Null}

for 92518 words.

Here is how you make a lookup:

In[19]:= get[hash,"pr"]//Short

Out[19]//Short=

{practicabilities,practicability,practicable,practicably,<<1696>>,prussic,pry,prying}

You can also use the hash table from System`Utilities`. The following is an

analogous constructor based on that:

ClearAll[makeHashAlt];

makeHashAlt[] :=

Module[{hash, innerHash},

innerHash = System`Utilities`HashTable[];

hash /: add[hash, key_, value_] :=

System`Utilities`HashTableAdd[innerHash, key,

addToList[

If[System`Utilities`HashTableContainsQ[innerHash, key],

With[{res = System`Utilities`HashTableGet[innerHash, key]},

System`Utilities`HashTableRemove[innerHash, key];

res],

(* else *)

emptyList[]

],

value]];

hash /:

get[hash, key_] /; !

System`Utilities`HashTableContainsQ[innerHash, key] := {};

hash /: get[hash, key_] :=

fromLinkedList@System`Utilities`HashTableGet[innerHash, key];

Return[hash]];

Create a new hash:

In[48]:= newhash = makeHashAlt[]

Out[48]= hash$1814

The performance will be slightly worse:

In[49]:= (add[newhash,##]&@@@Transpose[{keys,values}]);//Timing

Out[49]= {0.671,Null}

You can use it in the same way as the first version.

In[50]:= get[newhash,"pr"]//Short

Out[50]//Short=

{practicabilities,practicability,practicable,practicably,<<1696>>,prussic,pry,prying}

The whole point here is that this should scale really well, since the

insertion is always constant-time.

For hashes based on plain lists, this will be worse as the lists for a

given key grow larger (you will have

an average insertion time be proportional roughly to the average length of

the accumulated sublist).

Hope this helps. Consider cross-posting to

http://mathematica.stackexchange.com, to have some

more eyes on your problems of interest.

Cheers,

Leonid

On Mon, Feb 20, 2012 at 10:46 AM, Joseph Gwinn <joeg...@comcast.net> wrote:

> In article <jhqmrr$fbq$1...@smc.vnet.net>,

> David Bailey <da...@removedbailey.co.uk> wrote:

>

> > On 18/02/2012 11:28, Joseph Gwinn wrote:

I suspect that most of the time is spent it adding to sub-lists, rather

than hashing proper. I suggest to use a DownValues-based hash-table with

linked lists, as follows.

Here is a linked list "API":

ClearAll[linkedList, toLinkedList, fromLinkedList, addToList, pop,

emptyList];

SetAttributes[linkedList, HoldAllComplete];

toLinkedList[data_List] := Fold[linkedList, linkedList[], data];

fromLinkedList[ll_linkedList] :=

List @@ Flatten[ll, Infinity, linkedList];

addToList[ll_, value_] := linkedList[ll, value];

pop[ll_] := Last@ll;

emptyList[] := linkedList[];

Here is a simplistic hash table constructor (I exploit the fact that you

don't have to delete

key-value pairs):

ClearAll[makeHash];

makeHash[] :=

Module[{hash},

hash[_] := emptyList[];

hash /: add[hash, key_, value_] :=

hash[key] = addToList[hash[key], value];

hash /: get[hash, key_] := fromLinkedList@hash[key];

Return[hash]];

Let's create a new hash table:

In[10]:= hash = makeHash[]

Out[10]= hash$777

Now, I will populate it with the words of built-in English dictionary, the

keys being first two letters of a word:

values = DictionaryLookup["*"];

keys = StringTake[#, Min[2, StringLength[#]]] & /@ values;

It takes only half a second to fill the hash table, on my machine:

In[14]:= (add[hash,##]&@@@Transpose[{keys,values}]);//Timing

Out[14]= {0.515,Null}

for 92518 words.

Here is how you make a lookup:

In[19]:= get[hash,"pr"]//Short

Out[19]//Short=

{practicabilities,practicability,practicable,practicably,<<1696>>,prussic,pry,prying}

You can also use the hash table from System`Utilities`. The following is an

analogous constructor based on that:

ClearAll[makeHashAlt];

makeHashAlt[] :=

Module[{hash, innerHash},

innerHash = System`Utilities`HashTable[];

hash /: add[hash, key_, value_] :=

System`Utilities`HashTableAdd[innerHash, key,

addToList[

If[System`Utilities`HashTableContainsQ[innerHash, key],

With[{res = System`Utilities`HashTableGet[innerHash, key]},

System`Utilities`HashTableRemove[innerHash, key];

res],

(* else *)

emptyList[]

],

value]];

hash /:

get[hash, key_] /; !

System`Utilities`HashTableContainsQ[innerHash, key] := {};

hash /: get[hash, key_] :=

fromLinkedList@System`Utilities`HashTableGet[innerHash, key];

Return[hash]];

Create a new hash:

In[48]:= newhash = makeHashAlt[]

Out[48]= hash$1814

The performance will be slightly worse:

In[49]:= (add[newhash,##]&@@@Transpose[{keys,values}]);//Timing

Out[49]= {0.671,Null}

You can use it in the same way as the first version.

In[50]:= get[newhash,"pr"]//Short

Out[50]//Short=

{practicabilities,practicability,practicable,practicably,<<1696>>,prussic,pry,prying}

The whole point here is that this should scale really well, since the

insertion is always constant-time.

For hashes based on plain lists, this will be worse as the lists for a

given key grow larger (you will have

an average insertion time be proportional roughly to the average length of

the accumulated sublist).

Hope this helps. Consider cross-posting to

http://mathematica.stackexchange.com, to have some

more eyes on your problems of interest.

Cheers,

Leonid

On Mon, Feb 20, 2012 at 10:46 AM, Joseph Gwinn <joeg...@comcast.net> wrote:

> In article <jhqmrr$fbq$1...@smc.vnet.net>,

> David Bailey <da...@removedbailey.co.uk> wrote:

>

> > On 18/02/2012 11:28, Joseph Gwinn wrote:

> > Before constructing a hash table (other than for pedagogical reasons), I

> > would try the built in mechanisms of Mathematica. For example, here is a

> > trivial loop which creates a 'hash table' called hashtable, with values

> > f[key]:

> >

> > kk=1;

> > While[kk < 100000,

> > hashtable[kk]=f[kk];

> > kk++;

> > ];

> >

> > Of course, in this example the keys to the table are just the integers,

> > but the built-in mechanism will work with keys of any structure.

>

> > would try the built in mechanisms of Mathematica. For example, here is a

> > trivial loop which creates a 'hash table' called hashtable, with values

> > f[key]:

> >

> > kk=1;

> > While[kk < 100000,

> > hashtable[kk]=f[kk];

> > kk++;

> > ];

> >

> > Of course, in this example the keys to the table are just the integers,

> > but the built-in mechanism will work with keys of any structure.

>

> The table has to be updated more or less at random as the dataset is

> digested. What I'm doing is essentially your above method, but not all

> elements will ever receive a value. In fact, most will not.

>

> digested. What I'm doing is essentially your above method, but not all

> elements will ever receive a value. In fact, most will not.

>

> The reason to initialize the hashtable in advance is so a later probe to

> an empty table element yields {}, versus repeating the question back.

>

>

> an empty table element yields {}, versus repeating the question back.

>

>

Feb 22, 2012, 5:35:50 AM2/22/12

to

In article <jhvuoi$891$1...@smc.vnet.net>,

hashtable that's empty, it's this or that element of said hashtable.

> Also, please avoid using symbols that start with a capital letter.

> HashTable is a built-in symbol and if you define it (as you did in the

> example code you sent), it is likely to break stuff.

Well, I use initial (and embedded) caps all the time, for readable

symbols, and Mathematica does sometimes balk, so I make the symbol a word

longer.

As for HashTable, Mathematica complained, so I changed it to BigHashTable, and

peace was restored.

Joe Gwinn

Szabolcs Horvát <szho...@gmail.com> wrote:

> On 2/20/2012 9:48 AM, Joseph Gwinn wrote:

> >

> > The reason to initialize the hashtable in advance is so a later probe to

> > an empty table element yields {}, versus repeating the question back.

> >

> >

>

> I probably don't understand your question completely, but why can't you

> just make definition like

>

> hashtable[_] = {}

>

> ? This will ensure that {} is returned for "undefined" keys.

I'll have to try this and related approaches, but it's not the entire
> On 2/20/2012 9:48 AM, Joseph Gwinn wrote:

> >

> > The reason to initialize the hashtable in advance is so a later probe to

> > an empty table element yields {}, versus repeating the question back.

> >

> >

>

> I probably don't understand your question completely, but why can't you

> just make definition like

>

> hashtable[_] = {}

>

> ? This will ensure that {} is returned for "undefined" keys.

hashtable that's empty, it's this or that element of said hashtable.

> Also, please avoid using symbols that start with a capital letter.

> HashTable is a built-in symbol and if you define it (as you did in the

> example code you sent), it is likely to break stuff.

symbols, and Mathematica does sometimes balk, so I make the symbol a word

longer.

As for HashTable, Mathematica complained, so I changed it to BigHashTable, and

peace was restored.

Joe Gwinn

Feb 22, 2012, 5:36:20 AM2/22/12

to

In article <jhvulm$88e$1...@smc.vnet.net>, A Retey <aw...@gmx-topmail.de>

wrote:

> Hi,

>

> > The table has to be updated more or less at random as the dataset is

> > digested. What I'm doing is essentially your above method, but not all

> > elements will ever receive a value. In fact, most will not.

> >

> > The reason to initialize the hashtable in advance is so a later probe to

> > an empty table element yields {}, versus repeating the question back.

>

> Honestly, despite your additional explanations I don't feel I fully

> understand what you are trying to do, but the following assumes that it

> is essential that you use your own hash function instead of what

> Mathematica would internally use.

Two reasons. First, the nature of the hash function should not affect

how the hash table is built and subsequently probed. Second, hash

functions have mathematical properties, and not all hash functions are

suited to any given hash function. For instance, for handling many

kinds of data, like natural language text, order matters, so the

permutations of "abc" must all be distinct. Well, three reasons: The

hash functions like md5 are far too heavy for the current purposes.

I have an adequate solution for my present purposes, but wanted to know

if there were better ways. I've gotten some good suggestions. I plan

to try them all and publish the results on this newsgroup.

Joe Gwinn

wrote:

> Hi,

>

> > The table has to be updated more or less at random as the dataset is

> > digested. What I'm doing is essentially your above method, but not all

> > elements will ever receive a value. In fact, most will not.

> >

> > The reason to initialize the hashtable in advance is so a later probe to

> > an empty table element yields {}, versus repeating the question back.

>

> Honestly, despite your additional explanations I don't feel I fully

> understand what you are trying to do, but the following assumes that it

> is essential that you use your own hash function instead of what

> Mathematica would internally use.

how the hash table is built and subsequently probed. Second, hash

functions have mathematical properties, and not all hash functions are

suited to any given hash function. For instance, for handling many

kinds of data, like natural language text, order matters, so the

permutations of "abc" must all be distinct. Well, three reasons: The

hash functions like md5 are far too heavy for the current purposes.

if there were better ways. I've gotten some good suggestions. I plan

to try them all and publish the results on this newsgroup.

Joe Gwinn

Feb 22, 2012, 5:36:51 AM2/22/12

to

In article <jhvu5c$80d$1...@smc.vnet.net>,

not the table that's empty, it's this or that specific cell that's

empty, in an unpredictable and changing pattern. So one would be

attempting to update and access f[x], where x is a random integer. So,

I just tried this, and it works just fine, including handling variable

lists.

So, this and related approaches may be workable. I will have to

implement them in a real application, and check performance and

scalability.

Joe Gwinn

Richard Fateman <fat...@cs.berkeley.edu> wrote:

> On 2/19/2012 11:49 PM, Joseph Gwinn wrote:

> > In article<jhqmsq$fbt$1...@smc.vnet.net>,

> > Richard Fateman<fat...@cs.berkeley.edu> wrote:

>

> >> Initializing a hashtable with 10^4 (identical?) empty keys makes no

> >> sense. Did I misunderstand your message?

> >

> > No, you understood correctly. Please see my response to David Bailey

> > for the rationale.

> >

> > Joe Gwinn

> >

> Again, I may misunderstand, but I think you can accomplish the same

> thing by typing

>

> f[x_]:= {}

>

> if {} is the default empty value.

This creates the hashtable, and gives it a name. In a hashtable, it is
> On 2/19/2012 11:49 PM, Joseph Gwinn wrote:

> > In article<jhqmsq$fbt$1...@smc.vnet.net>,

> > Richard Fateman<fat...@cs.berkeley.edu> wrote:

>

> >> Initializing a hashtable with 10^4 (identical?) empty keys makes no

> >> sense. Did I misunderstand your message?

> >

> > No, you understood correctly. Please see my response to David Bailey

> > for the rationale.

> >

> > Joe Gwinn

> >

> Again, I may misunderstand, but I think you can accomplish the same

> thing by typing

>

> f[x_]:= {}

>

> if {} is the default empty value.

not the table that's empty, it's this or that specific cell that's

empty, in an unpredictable and changing pattern. So one would be

attempting to update and access f[x], where x is a random integer. So,

I just tried this, and it works just fine, including handling variable

lists.

So, this and related approaches may be workable. I will have to

implement them in a real application, and check performance and

scalability.

Joe Gwinn

Feb 23, 2012, 5:51:22 AM2/23/12

to

On 2/22/2012 4:34 AM, Leonid Shifrin wrote:

>

> You can also use the hash table from System`Utilities`.

How can one obtain documentation on these functions?
>

> You can also use the hash table from System`Utilities`.

Very Hard to use something, and even program for an API that

is not documented.

I tried

?? System`Utilities`HashTable

and it all returned is

Attributes[System`Utilities`HashTable]={Protected}

googling did not help. Any other place to obtain documentation?

And why are these not in the help system for Mathematica? I am

using V 8.04

thanks,

--Nasser

Feb 24, 2012, 1:00:01 AM2/24/12

to

Nasser,

Sorry, can't add more here. I myself learned about these functions very

recently from this post by Oleksandr Rasputinov:

http://mathematica.stackexchange.com/questions/990/struct-equivalent-in-mathematica/1029#1029

And that's pretty much all publicly available information on this, I

believe. These functions probably serve internal purposes, and so are undocumented. This is a normal practice, anyone who is willing to dig out and use undocumented functionality should accept that it is, well, undocumented.

Cheers,

Leonid

Sorry, can't add more here. I myself learned about these functions very

recently from this post by Oleksandr Rasputinov:

http://mathematica.stackexchange.com/questions/990/struct-equivalent-in-mathematica/1029#1029

And that's pretty much all publicly available information on this, I

believe. These functions probably serve internal purposes, and so are undocumented. This is a normal practice, anyone who is willing to dig out and use undocumented functionality should accept that it is, well, undocumented.

Cheers,

Leonid

Reply all

Reply to author

Forward

0 new messages

Search

Clear search

Close search

Google apps

Main menu