or to increase a counter while reducing it, a function like inc+ returning {:sum sum :count count} and then take the sum/counter, which is the mean.
The problem is possible to state as a clean map-reduce problem with only one traversing of the data. It's also possible to remove items form the mean operation (ie, the problem is associative).
> -- > You received this message because you are subscribed to the Google > Groups "Clojure" group. > To post to this group, send email to clojure@googlegroups.com > Note that posts from new members are moderated - please be patient with > your first post. > To unsubscribe from this group, send email to > clojure+unsubscribe@googlegroups.com > For more options, visit this group at > http://groups.google.com/group/clojure?hl=en
> The obvious way is like the following, which traverse the sequence 2 times.
> Wondering what will be the efficient way...
(defn avg [coll]
(loop [c coll tot 0 cnt 0]
(if (empty? c)
(/ tot cnt)
(recur (rest c) (+ tot (first c)) (inc cnt)))))
This will loop only once.
It happens to be faster, but the difference is not explained by (count
coll) elimination only. Maybe reduce is a factor as well.
This reduce function is somewhat unusual in that its arguments have
different forms. As a result, this one does require the initial-value
argument be used. It's set to [0 0] indicating the sum and count both
start at 0. The function then "consumes" the numbers in coll one at a
time, producing the running sum and count each time. Then we just
apply / to divide the sum by the count.
On Mar 28, 9:18 pm, "simon.T" <simon.j....@gmail.com> wrote:
On Thu, Mar 29, 2012 at 12:18 AM, simon.T <simon.j....@gmail.com> wrote: > The obvious way is like the following, which traverse the sequence 2 times. > ...
The obvious way does not necessarily traverse the sequence twice. If a sequence S satisfies the 'counted?' predicate, (count S) takes constant time. In particular user=> (counted? [:a :b :c]) true
So it is very likely that (/ (reduce + coll) (count coll)) will not traverse 'coll' twice, and the natural way is the preferred way. <Insert standard warning about premature optimization./>
> On Thu, Mar 29, 2012 at 12:18 AM, simon.T <simon.j....@gmail.com> wrote:
> > The obvious way is like the following, which traverse the sequence 2 times.
> > ...
> The obvious way does not necessarily traverse the sequence twice. If
> a sequence S satisfies the 'counted?' predicate, (count S) takes
> constant time. In particular
> user=> (counted? [:a :b :c])
> true
> So it is very likely that (/ (reduce + coll) (count coll)) will not
> traverse 'coll' twice, and the natural way is the preferred way.
> <Insert standard warning about premature optimization./>
"Very likely" strikes me as a huge overstatement here. Most sequences
that you want to average won't be source-code literals, they'll be
lazy sequences, and those aren't counted.
> "Very likely" strikes me as a huge overstatement here. Most sequences > that you want to average won't be source-code literals, they'll be > lazy sequences, and those aren't counted
Point taken about lazy sequences. But the above was not intended to suggest the sequence needs to be source code literal to satisfy 'counted?', rather that vectors, lists, maps, and sets do so. That covers a fair bit of ground.
On Mar 29, 10:43 pm, Alan Malloy <a...@malloys.org> wrote:
> "Very likely" strikes me as a huge overstatement here. Most sequences
> that you want to average won't be source-code literals, they'll be
> lazy sequences, and those aren't counted.
I think this topic is interesting. My guess would be, that the
sequence would have been traversed completely by the reduce call and
therefore clojure could know it's size and provide a constant time
count.
I too think this is interesting because because it serves to illustrate some important general aspects of Clojure with a very simple problem.
I wrote four Clojure programs contrasting different ways of solving the problem, and then timed the application of each ten times to a million-item sequence /mill-float-numbs/ of floating-point random numbers. Here are the interesting results:
> I think this topic is interesting. My guess would be, that the > sequence would have been traversed completely by the reduce call and > therefore clojure could know it's size and provide a constant time > count.
On Thu, 2012-03-29 at 23:31 -0700, Benjamin Peter wrote: > the sequence would have been traversed completely by the reduce call > and therefore clojure could know it's size and provide a constant time > count.
> Could this be implemented?
Yes. You could probably implement it yourself, as a wrapper sequence type, though this wouldn't make all other sequence types automatically countable.
> Is it?
It won't be, in general, because you would have to "hold onto the head" until you got to the end, or go through a costly half-broken weak reference dance.
It is worth considering that even for some uncounted sequences, the cost of a second traversal for "count" may be less than the bookkeeping cost of keeping a count as you traverse once.
-- Stephen Compall ^aCollection allSatisfy: [:each|aCondition]: less is better
> This reduce function is somewhat unusual in that its arguments have > different forms. As a result, this one does require the initial-value > argument be used. It's set to [0 0] indicating the sum and count both > start at 0. The function then "consumes" the numbers in coll one at a > time, producing the running sum and count each time. Then we just > apply / to divide the sum by the count.
> On Mar 28, 9:18 pm, "simon.T" <simon.j....@gmail.com> wrote: > > The obvious way is like the following, which traverse the sequence 2 > times. > > Wondering what will be the efficient way...
I can understand the first one being "slow" but I'm a bit surprised about the loop being the slowest of the four options. Can someone shed some light on that?
Nice to see the accumulating reduce being faster since that's how I've settled in to solving this kind of problem in our code, without really wondering about performance (it's "fast enough" and I think it's the more elegant solution). -- Sean A Corfield -- (904) 302-SEAN An Architect's View -- http://corfield.org/ World Singles, LLC. -- http://worldsingles.com/
"Perfection is the enemy of the good." -- Gustave Flaubert, French realist novelist (1821-1880)
I have installed Leiningen not so much to manage projects but to enable use of /clojure-jack-in/ as a means of getting Swank and Slime to work.
And they do work for me. But now I have a question that I can't find an answer for in any Leiningen documentation I know about. I have a largish, previously created group of /clj/ local files that exist on my system and whose function definitions I want to use in my further work.
Is there a way to specify such local files in a project :dependencies declaration -- or do I have to do a /load-file/ on each of them once I get a Slime REPL running for the project (or maybe insert them all into the /src/core.clj/ file of the project and do a single /load-file/ on it)?
That question is surely complicated enough to indicate how badly confused I am! Thanks for help.
--Larry
On Sat, Apr 28, 2012 at 12:21 AM, Larry Travis <tra...@cs.wisc.edu> wrote:
> I have installed Leiningen not so much to manage projects but to enable use
> of clojure-jack-in as a means of getting Swank and Slime to work. And they
> do work for me. But now I have a question that I can't find an answer for
> in any Leiningen documentation I know about. I have a largish, previously
> created group of clj local files that exist on my system and whose function
> definitions I want to use in my further work. Is there a way to specify
> such local files in a project :dependencies declaration -- or do I have to
> do a load-file on each of them once I get a Slime REPL running for the
> project (or maybe insert them all into the src/core.clj file of the project
> and do a single load-file on it)?
Once possibility is to put all those files into a project together,
let's called it utilities and assume it has a 0.0.1-SNAPSHOT version.
Then you can run 'lein install' and it'll package them up and put them
in your local repository.
Then, wherever you want to use them in another project, just declare a
dependency on [utilities "0.0.1-SNAPSHOT"] and Leiningen will
automatically pull them in (from your local repo).
Then you just use/require the namespace(s) containing the relevant functions.
Sean,
Your advice makes good sense, but I can't make it work. Per that advice, I paste some of my function definitions into the core.clj file of the project "prjctOne" and proceed thusly:
larrytravis$ lein install
Copying 1 file to /Users/larrytravis/prjctOne/lib
No namespaces to :aot compile listed in project.clj.
Created /Users/larrytravis/prjctOne/prjctOne-1.0.0-SNAPSHOT.jar
Wrote pom.xml
[INFO] Installing /Users/larrytravis/prjctOne/prjctOne-1.0.0-SNAPSHOT.jar to /Users/larrytravis/.m2/repository/prjctOne/prjctOne/1.0.0-SNAPSHOT/prjctOne -1.0.0-SNAPSHOT.jar
But when I create a new project prjctTwo, I can't figure out what dependency declaration for it gives a prjctTwo Slime REPL access to the functions in prjctOne. That is, what would correspond to the [utilities "0.0.1-SNAPSHOT"] vector in your example?
--Larry
> On Sat, Apr 28, 2012 at 12:21 AM, Larry Travis<tra...@cs.wisc.edu> wrote:
>> I have installed Leiningen not so much to manage projects but to enable use
>> of clojure-jack-in as a means of getting Swank and Slime to work. And they
>> do work for me. But now I have a question that I can't find an answer for
>> in any Leiningen documentation I know about. I have a largish, previously
>> created group of clj local files that exist on my system and whose function
>> definitions I want to use in my further work. Is there a way to specify
>> such local files in a project :dependencies declaration -- or do I have to
>> do a load-file on each of them once I get a Slime REPL running for the
>> project (or maybe insert them all into the src/core.clj file of the project
>> and do a single load-file on it)?
> Once possibility is to put all those files into a project together,
> let's called it utilities and assume it has a 0.0.1-SNAPSHOT version.
> Then you can run 'lein install' and it'll package them up and put them
> in your local repository.
> Then, wherever you want to use them in another project, just declare a
> dependency on [utilities "0.0.1-SNAPSHOT"] and Leiningen will
> automatically pull them in (from your local repo).
> Then you just use/require the namespace(s) containing the relevant functions.
On Sat, Apr 28, 2012 at 2:41 PM, Larry Travis <tra...@cs.wisc.edu> wrote:
> Created /Users/larrytravis/prjctOne/prjctOne-1.0.0-SNAPSHOT.jar
...
> prjctOne. That is, what would correspond to the [utilities
> "0.0.1-SNAPSHOT"] vector in your example?
Sean:
Your suggestion doesn't work. The Slime REPL comes up fine when I use the dependency vector you suggest (some of the vectors I have tried prevent the Swank server from starting), but the REPL doesn't know anything about the functions defined in prjctOne or about the name-space in which they exist.
So I guess for the nonce I'll just use load-file commands in the Slime REPL to make use of my local stash of clj files, and I'll worry some time later about getting Leiningen to construct dependencies on those files. I'd rather spend my time on getting my programs to work than on getting my programming environment to work.
If you get any more ideas about how I might try to solve my problem, let me know. Thanks.
--Larry
> On Sat, Apr 28, 2012 at 2:41 PM, Larry Travis<tra...@cs.wisc.edu> wrote:
>> That is, what would correspond to the [utilities
>> "0.0.1-SNAPSHOT"] vector in your example?
> Try [prjctOne "1.0.0-SNAPSHOT"]
leiningen relies on maven dependency resolution...
the dependency entry is of the form
[groupId/artifactId "version"]
You have the groupId and the artifactId both set to prjctOne. You can tell
this from the path it's installed into your local repo. I believe this is
default behaviour if you specify a simple project name in project.clj (i.e
one without a /)
This is the key entry in your logging:
[INFO] Installing /Users/larrytravis/prjctOne/**prjctOne-1.0.0-SNAPSHOT.jar
to /Users/larrytravis/.m2/**repository/prjctOne/prjctOne/**1.0.0-
SNAPSHOT/prjctOne-1.0.0-**SNAPSHOT.jar
On Sun, Apr 29, 2012 at 12:42 AM, Larry Travis <tra...@cs.wisc.edu> wrote:
> Sean:
> Your suggestion doesn't work. The Slime REPL comes up fine when I use the
> dependency vector you suggest (some of the vectors I have tried prevent the
> Swank server from starting), but the REPL doesn't know anything about the
> functions defined in prjctOne or about the name-space in which they exist.
> So I guess for the nonce I'll just use load-file commands in the Slime
> REPL to make use of my local stash of clj files, and I'll worry some time
> later about getting Leiningen to construct dependencies on those files.
> I'd rather spend my time on getting my programs to work than on getting my
> programming environment to work.
> If you get any more ideas about how I might try to solve my problem, let
> me know. Thanks.
> --Larry
> On 4/28/12 5:07 PM, Sean Corfield wrote:
>> On Sat, Apr 28, 2012 at 2:41 PM, Larry Travis<tra...@cs.wisc.edu> wrote:
>>> That is, what would correspond to the [utilities
>>> "0.0.1-SNAPSHOT"] vector in your example?
>> Try [prjctOne "1.0.0-SNAPSHOT"]
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscribe@**googlegroups.com<clojure%2Bunsubscribe@googlegroups.c om>
> For more options, visit this group at
> http://groups.google.com/**group/clojure?hl=en<http://groups.google.com/group/clojure?hl=en>
> leiningen relies on maven dependency resolution...
> the dependency entry is of the form
> [groupId/artifactId "version"]
> You have the groupId and the artifactId both set to prjctOne. You can > tell this from the path it's installed into your local repo. I believe > this is default behaviour if you specify a simple project name in > project.clj (i.e one without a /)
> This is the key entry in your logging:
> [INFO] Installing > /Users/larrytravis/prjctOne/prjctOne-1.0.0-SNAPSHOT.jar to > /Users/larrytravis/.m2/repository/prjctOne/prjctOne/1.0.0-SNAPSHOT/prjctOne -1.0.0-SNAPSHOT.jar
> On Sun, Apr 29, 2012 at 12:42 AM, Larry Travis <tra...@cs.wisc.edu > <mailto:tra...@cs.wisc.edu>> wrote:
> Sean:
> Your suggestion doesn't work. The Slime REPL comes up fine when I
> use the dependency vector you suggest (some of the vectors I have
> tried prevent the Swank server from starting), but the REPL
> doesn't know anything about the functions defined in prjctOne or
> about the name-space in which they exist.
> So I guess for the nonce I'll just use load-file commands in the
> Slime REPL to make use of my local stash of clj files, and I'll
> worry some time later about getting Leiningen to construct
> dependencies on those files. I'd rather spend my time on getting
> my programs to work than on getting my programming environment to
> work.
> If you get any more ideas about how I might try to solve my
> problem, let me know. Thanks.
> --Larry
> On 4/28/12 5:07 PM, Sean Corfield wrote:
> On Sat, Apr 28, 2012 at 2:41 PM, Larry
> Travis<tra...@cs.wisc.edu <mailto:tra...@cs.wisc.edu>> wrote:
> That is, what would correspond to the [utilities
> "0.0.1-SNAPSHOT"] vector in your example?
> Try [prjctOne "1.0.0-SNAPSHOT"]
> -- > You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> <mailto:clojure@googlegroups.com>
> Note that posts from new members are moderated - please be patient
> with your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscribe@googlegroups.com
> <mailto:clojure%2Bunsubscribe@googlegroups.com>
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
On Mon, Apr 30, 2012 at 9:28 PM, Larry Travis <tra...@cs.wisc.edu> wrote:
> Phil, Neale, Sean:
> You guys are all way ahead of me as to why I am getting the results I am
> getting, but it is only Neale's advice that works. That is
> [prjctOne/prjctOne "1.0.0-SNAPSHOT"] works, but
> [prjctOne "1.0.0-SNAPSHOT"] does not.
Interesting. I've never seen that behaviour before; it sounds like a bug.
I'm trying to reproduce this problem here but am unable to. Can you
provide steps to reproduce it locally? What version of Leiningen is
this?
Phil:
I now can't get the behavior to reproduce either. I have no idea what kind of dumb mistake I was making in the first place, and I'm very sorry to have wasted your time. (For what it's worth, both dependency-vector versions work in my reproduction attempts -- but you probably already knew that they would!)
--Larry
> On Mon, Apr 30, 2012 at 9:28 PM, Larry Travis<tra...@cs.wisc.edu> wrote:
>> Phil, Neale, Sean:
>> You guys are all way ahead of me as to why I am getting the results I am
>> getting, but it is only Neale's advice that works. That is
>> [prjctOne/prjctOne "1.0.0-SNAPSHOT"] works, but
>> [prjctOne "1.0.0-SNAPSHOT"] does not.
> Interesting. I've never seen that behaviour before; it sounds like a bug.
> I'm trying to reproduce this problem here but am unable to. Can you
> provide steps to reproduce it locally? What version of Leiningen is
> this?
On Wed, May 2, 2012 at 5:44 PM, Larry Travis <tra...@cs.wisc.edu> wrote:
> I now can't get the behavior to reproduce either. I have no idea what kind
> of dumb mistake I was making in the first place, and I'm very sorry to have
> wasted your time. (For what it's worth, both dependency-vector versions work
> in my reproduction attempts -- but you probably already knew that they
> would!)
That makes me feel better because I couldn't understand why my
suggestion didn't work :)
Glad it's working for you now - welcome to the wonderful world of Leiningen!
-- Sean A Corfield -- (904) 302-SEAN
An Architect's View -- http://corfield.org/ World Singles, LLC. -- http://worldsingles.com/
"Perfection is the enemy of the good."
-- Gustave Flaubert, French realist novelist (1821-1880)