Since PRIN1 does not preserve the hash contents to the output file (it
writes just the hash handle, e.g. "#<EQUALP hash table, 11723 entries
{581579ED}>"), I've written a simple routine to convert the hash to a
two dimensional array instead:
(defun hash-to-2D-array (hash)
(let ((elems (hash-table-count hash)))
(when (> elems 0)
(let ((index (make-array (list elems 2)))
(counter 0))
(maphash #'(lambda (k v)
(setf (aref index counter 0) k)
(setf (aref index counter 1) v)
(incf counter))
hash)
index))))
Because the hash table is simple (each key is a string and each value
is a list of strings), using PRIN1 on the resulting array does
serialize all the content.
Although it's functional, this conversion method is slow: for example,
even for hash tables containing around 25,000 entries, it takes several
seconds to produce the array.
Is there a better way of writing (hash-to-2D-array) or is there a
better way to serialize a simple hash table without having to convert
to an array first?
I don't understand. Do you need to print a hash table, or do you need
to convert it to a 2D array?
(defpackage "MINE" (:use "CL"))
(defun mine::hash-table (&key size test rehash-size rehash-threshold elements)
(let ((table (make-hash-table :size size :test test
:rehash-size rehash-size
:rehash-threshold rehash-threshold)))
(dolist (item elements table)
(setf (gethash (first item) table) (second item)))))
(defmethod print-object ((self hash-table) stream)
(format stream "#.(MINE::HASH-TABLE :SIZE ~D :TEST (FUNCTION ~S) ~%~
~& :REHASH-SIZE ~A :REHASH-THRESHOLD ~A~%~
~& :ELEMENTS '("
(hash-table-count self) (hash-table-test self)
(hash-table-rehash-size self) (hash-table-rehash-threshold self))
(maphash (lambda (k v) (format stream "~%(~S ~S)" k v)) self)
(format stream "))")
self)
(print-object (hash-table :size 3 :test (function equal)
:rehash-size 1.5 :rehash-threshold 0.75
:elements '((a 1) (b 2) (c 3))) t)
prints:
#.(MINE::HASH-TABLE :SIZE 3 :TEST (FUNCTION EXT:FASTHASH-EQUAL)
:REHASH-SIZE 1.5S0 :REHASH-THRESHOLD 0.75s0
:ELEMENTS '(
(C 3)
(B 2)
(A 1)))
returns:
#S(HASH-TABLE :TEST EXT:FASTHASH-EQUAL (C . 3) (B . 2) (A . 1))
Of course, the easiest way would be to use the right implementation...
[53]> (let ((h (mine::hash-table :size 25000 :test (function EXT:FASTHASH-EQUAL)
:rehash-size 1.5s0 :rehash-threshold 0.75s0
:elements (loop :for i :from 0 :below 25000 :collect (list (format nil "~R" i) i)))))
(time (prog1 nil (with-output-to-string (out) (print-object h out)))))
Real time: 2.427929 sec.
Run time: 2.416151 sec.
Space: 44662496 Bytes
GC: 21, GC time: 0.832054 sec.
NIL
[54]>
2.5 seconds not even compiled!
--
__Pascal Bourguignon__ http://www.informatimago.com/
Un chat errant
se soulage
dans le jardin d'hiver
Shiki
The goal is to be able to serialize the hash table to a file, so that
it can be read back into memory as the same hash table object at a
later time, either by the original process or a completely different
process.
The nice thing about converting the hash to an array is that when I
read the file containing the array (i.e. the file created by passing
the array to PRIN1), it is loaded and recongized immediately as an
array object -- of course, I then have to convert the array to a hash
table to get back to the point I want.
So while I recognize that your suggestion will indeed print the hash
contents, what happens when I try to read that output back into memory?
I'll still need some kind of conversion function to get me back to the
original hash table object, correct?
> Although it's functional, this conversion method is slow: for example,
> even for hash tables containing around 25,000 entries, it takes several
> seconds to produce the array.
It seems likely to me that producing the array is very fast, but
printing it (especially if *PRINT-PRETTY* is T) will take some time.
What happens if you do something like:
(progn (hash-to-2D-array hash) nil)
Is it faster than just (hash-to-2D-array hash)?
Zach
Zach, that was brilliant!
You're right, the array conversion is fast, but calling PRIN1 with a
large array object causes the delay.
Although *PRINT-PRETTY* was set to T, I didn't think it mattered in
prod because I run all my complied code in -quiet mode.
So even without changing the default *PRINT-PRETTY* value (though I
need to research that a bit more), wrapping the PRIN1 call around
(progn ... nil) made a *huge* difference in performance.
Many thanks for pointing it out.
> The goal is to be able to serialize the hash table to a file, so that
> it can be read back into memory as the same hash table object at a
> later time, either by the original process or a completely different
> process.
[...]
> I'll still need some kind of conversion function to get me back to the
> original hash table object, correct?
just for serialization and deserialization - do you know:
http://common-lisp.net/project/cl-store/
okflo
--
Otto Karl Florian Diesenbacher
Perhaps you should read my previous post better...
--
__Pascal Bourguignon__ http://www.informatimago.com/
Nobody can fix the economy. Nobody can be trusted with their finger
on the button. Nobody's perfect. VOTE FOR NOBODY.
> I'll still need some kind of conversion function to get me back to the
> original hash table object, correct?
It's already in Pascal's code.
Well, ever since I saw this post of yours (and I'm sure you were
probably joking):
http://groups.google.com/group/comp.lang.lisp/browse_thread/thread/88a36fb2c239a44e/4c95766a8bdda347?lnk=gst&q=&rnum=63#4c95766a8bdda347
I'm less likely to read your replies very thoroughly.
Thanks for mentioning this -- I'll definitely try it.
This post was written to motivate, on the contrary, more VERY throughful reading.
--
__Pascal Bourguignon__ http://www.informatimago.com/
THIS IS A 100% MATTER PRODUCT: In the unlikely event that this
merchandise should contact antimatter in any form, a catastrophic
explosion will result.
Pascal's code provides this. It serializes the hash table to a list
which becomes a hash-table again when read.
I agree with you about treating Pascal's suggestions with some care,
occasionally stuff he says is crazy, but mostly he knows what he's
talking about.