Vincent Foley wrote:
> Hello,
>
> For the past few days, I've been trying, unsuccessfully, to make an
> application I wrote faster. A Java program that performs, more or
> less, the same task takes 12 seconds (on my machine) to parse 1000
> files; my Clojure program takes nearly 3 minutes. This is more than an
> order of magnitude slower! Using the profiling tools available with
> the JVM, I quickly determined which function was the costliest. I
> copied it into a simple script file to profile it in isolation. I
> have made the script and the profile results (long!) available at this
> URL: http://gist.github.com/82136
>
On my box, nearly half of the total time is spent building arr (and,
judging by your profiling data, a good chunk of those traces are
related to this map call).
It is better to move it out of the loop:
(let [arr (into-array Byte/TYPE
                      (map byte [7                              ; 1 Byte
                                 3 0 97 0 98 0 99               ; 3 Shorts
                                 0 0 100 100                    ; 1 Integer
                                 65 66 67 68 69 70 71 72 73 74  ; 10 String
                                 0 0 0 0 0 0 0 0 0 0]))         ; 10 ignored
      buf (ByteBuffer/wrap arr)]
  (time
    (dotimes [_ 10000]
      (.position buf 0)
      (run buf))))
Can you give the profile results for this code?
Christophe
--
Professional: http://cgrand.net/ (fr)
On Clojure: http://clj-me.blogspot.com/ (en)
Correction: it's the dispatching, not the computation of the dispatch
value, that dominates.
Vincent Foley wrote:
> Using the new versions of null-string and read-field-aux that you gave
> me, in my real application, the execution time went from 160 seconds
> to 150 seconds. As for using macros, I wrote one for the example
> program, but I realized that it wouldn't work in my application,
> because I sometimes use (apply parse-buffer buf field-vector).
>
I looked at your program; you can use macros there too.
Given compile-fields, a function that outputs the code of a closure
taking a ByteBuffer as its only argument, you can rewrite action as a
macro like this:
(defmacro action
  [name & v-forms]
  {:name name
   :fields (compile-fields v-forms)})
and replace (apply parse-buffer buf fields) with (fields buf).
Do you think it's feasible?
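To make the idea concrete, here is a minimal sketch. It assumes a hypothetical spec format of [name type] pairs with only :byte, :short and :int types; this compile-fields is purely illustrative and is not the code from the actual program, whose field vectors may look different.

```clojure
(import 'java.nio.ByteBuffer)

;; Hypothetical spec format: [name type], where type is :byte, :short or :int.
(defn compile-fields
  "Returns the code (not the value) of a fn that takes a ByteBuffer and
  reads the given fields from it, in order, into a map."
  [specs]
  (let [buf       (with-meta (gensym "buf") {:tag 'java.nio.ByteBuffer})
        read-form (fn [type]
                    (case type
                      :byte  `(.get ~buf)
                      :short `(.getShort ~buf)
                      :int   `(.getInt ~buf)))]
    `(fn [~buf]
       (-> {}
           ~@(for [[fname ftype] specs]
               `(assoc ~(keyword fname) ~(read-form ftype)))))))

;; The macro from the message above: the field forms are compiled once,
;; at macro-expansion time, instead of being interpreted on every call.
(defmacro action
  [name & v-forms]
  {:name name
   :fields (compile-fields v-forms)})

(def header
  (action "header" [version :byte] [length :short] [id :int]))

(let [buf (ByteBuffer/wrap (byte-array [7 0 3 0 0 0 100]))]
  ((:fields header) buf))
;; => {:version 7, :length 3, :id 100}
```

The generated closure contains only direct, type-hinted ByteBuffer calls, so no dispatch on the field type is left at run time.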
This reminds me... I would LOVE to see in Clojure something like
Erlang's bit-packing/destructuring syntax for working with bit
manipulation in a high-level way.
I'm actually casually hacking on something like this right now, as I
want to attempt to port a radar processing code I wrote into Clojure.
(It's currently implemented in modern C++ with a functional-ish style)
The WSR-88D file format is *extremely* bizarre and convoluted and
requires a lot of bit-fiddling. Half or more of the current code is
dedicated to converting this format into something sane. I actually hate
that part of my C++ implementation the most because it's all
hand-written unpacking and endianness-fixing of byte arrays which seems
impossible to clean up any further.
So, what I actually have so far is a macro that lets you basically
write Erlang bit-syntax in S-exps; it expands into a vector that
you can use as the binding list of a let form. The IP header unpacking
example from Erlang's documentation looks something like this:
(let (bits dgram
       [ip-ver 4] [hdr-len 4] svc-type [tot-len 16]
       [id 16] [flags 3] [frag-off 13]
       ttl proto [hdr-cksum 16]
       [src-ip 32] [dst-ip 32] [rest-dgram :binary])
  'stuff)
Not shown: floats, signedness, and endianness (defaults are all the same
as Erlang).
It's "okay", but it looks unnatural and I don't really like doing it
that way very much. The source and destination names are backwards
(dgram is being destructured into ip-ver, hdr-len and so on), and the
syntax hijacks the whole let form, because I don't think you can write a
macro that automatically splices into its parent form.
So I had been thinking of composing an email asking how likely we are
to see an extensibility API for destructuring. The idea is that the
bit-syntax library could then make this custom destructuring
pervasive, look a little more conventional, and let you mix it with
regular binding forms (instead of having to use an explicit let that
is completely hijacked by the bit-syntax).
You could of course write something that would unpack bits into a
vector, and then destructure the vector on the left-hand side, but that
separates the names which will be bound (on the left) from their field
specifiers (passed to the macro on the right). I found this difficult to
read when the specifier list gets large, so I implemented it this way so
the bound names are next to their specifications (just like Erlang's).
Details: the macro expands into a bunch of sequential let bindings that
progressively tease apart the blob using some helper functions. I
haven't actually implemented the functions that do this teasing yet
because that requires me to go digging in Java API docs to see what they
provide for dealing with bit-bashing, and I haven't had the inclination
to do that just yet :)
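For illustration only, one possible shape for such a helper and for the bindings the macro might expand into. The name read-bits and its (array, bit offset, width) signature are assumptions made for this sketch, not the unwritten helpers described above; it handles unsigned big-endian fields only, with no floats, signedness or endianness options.

```clojure
;; Hypothetical helper: not part of any existing library.
(defn read-bits
  "Reads `width` bits (1 to 63) from byte array `bs`, starting `offset`
  bits from the beginning, most significant bit first. Returns the
  field as a non-negative long."
  [^bytes bs offset width]
  (loop [i offset, acc 0]
    (if (= i (+ offset width))
      acc
      (let [b   (aget bs (quot i 8))                          ; byte holding bit i
            bit (bit-and (bit-shift-right b (- 7 (rem i 8))) 1)]
        (recur (inc i) (bit-or (bit-shift-left acc 1) bit))))))

;; Roughly what the first few IP header fields could expand into:
;; each binding reads its field at a running bit offset.
(defn ip-prefix
  [dgram]
  (let [ip-ver   (read-bits dgram 0 4)
        hdr-len  (read-bits dgram 4 4)
        svc-type (read-bits dgram 8 8)
        tot-len  (read-bits dgram 16 16)]
    [ip-ver hdr-len svc-type tot-len]))

(ip-prefix (byte-array [0x45 0x00 0x00 0x54]))
;; => [4 5 0 84]
```

Reading one bit at a time is slow but easy to check; a real implementation would more likely shift and mask whole bytes.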
So, is there any chance of extensible destructuring? What would such an
API look like? I have thought about it a lot, but the minutiae of making
it general enough to be useful, yet simple enough to be implementable,
are probably beyond my grasp.
-Kyle
-- Aaron
Ugh, I should have looked at your code before I sent that. There it
is on line 1. ;)
-- Aaron