Spitting out a lazy seq to file???

Thomas

Aug 16, 2011, 11:26:09 AM
to Clojure
Hi everyone,

I have been struggling with this (hopefully simple) problem for
quite some time. What I want to do is:

*) read a file line by line
*) modify each line
*) write it back to a different file

This is a bit of sample code that reproduces the problem:

==========================================================
(def old-data (line-seq (reader "input.txt")))

(defn change-line
  [i]
  (str i " added stuff"))

(spit "output.txt" (map change-line old-data))
==========================================================
#cat "output.txt"
clojure.lang.LazySeq@58d844f8

Because I get a lazy sequence, I think I have to force its
evaluation, but where exactly? And how?

Thanks in advance!!!

Thomas

Ken Wesson

Aug 16, 2011, 11:51:44 AM
to clo...@googlegroups.com

The spit function expects a string; you need to pr-str the object.
However, that will output it like the REPL would: (line1 line2 line3
...)

You probably want no parentheses, and separate lines. So you'll want
something more like

(with-open [w (writer-on "output.txt")]
  (binding [*out* w]
    (doseq [l (map change-line old-data)]
      (println l))))

The output part is lazy now, so you might want to consider making the
input part lazy as well:

(with-open [r (reader-on the-input-file)
            w (writer-on "output.txt")]
  (binding [*out* w]
    (doseq [l (line-seq r)]
      (println (change-line l)))))

(note: untested, and assumes suitable reader-on and writer-on
functions such as from contrib)

Then it will be able to process files bigger than can be held in main
memory all at once.

--
Protege: What is this seething mass of parentheses?!
Master: Your father's Lisp REPL. This is the language of a true
hacker. Not as clumsy or random as C++; a language for a more
civilized age.

Meikel Brandmeyer

Aug 16, 2011, 12:06:57 PM
to clo...@googlegroups.com
Hi,

spit does not take a sequence of lines. Hence you'll have to do something like this:

(->> old-data
     (map change-line)
     (interpose \newline)
     (apply str)
     (spit "output.txt"))
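
An equivalent way to build the same string is clojure.string/join. This is only a sketch: the sample lines below are made up for illustration, and change-line is redefined so the snippet stands on its own.

```clojure
(require '[clojure.string :as str])

(defn change-line [line]
  (str line " added stuff"))

;; Hypothetical sample data standing in for the lines of input.txt.
(def sample-lines ["line1" "line2"])

;; join puts a newline between the lines, like (interpose \newline) + (apply str).
(def output-text (str/join "\n" (map change-line sample-lines)))

;; spit receives a single, fully built string.
(spit "output.txt" output-text)
```

As with the interpose version, the whole output is held in memory as one string before it is written, and no newline follows the last line.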

Sincerely
Meikel

Rasmus Svensson

Aug 16, 2011, 12:35:40 PM
to clo...@googlegroups.com
2011/8/16 Thomas <th.van...@gmail.com>:

spit operates on strings, so you have to turn your data into
one big string first. (spit implicitly calls str on its argument, but
this is not very useful for a lazy seq.)

What you have is a sequence of strings, so depending on what you want
to appear in the file you have multiple options:

For an arbitrary clojure data structure you can use pr-str to convert
it into the same format you get at the repl: (spit "output.txt"
(pr-str (map change-line old-data))). This can be useful to dump some
data to a file and will yield something like this:

("line1 added stuff" "line2 added stuff")
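
To make the difference between str and pr-str concrete, here is a small sketch. The sample data is invented for illustration, and the exact text str produces for a lazy seq may depend on the Clojure version.

```clojure
;; Hypothetical sample data standing in for the lines of input.txt.
(def changed (map #(str % " added stuff") ["line1" "line2"]))

;; str on a lazy seq does not print its elements; in the versions
;; discussed in this thread it yields something like
;; "clojure.lang.LazySeq@58d844f8".
(def via-str (str changed))

;; pr-str realizes the seq and prints its elements in readable form.
(def via-pr-str (pr-str changed))
```

Here via-pr-str holds the parenthesized, quoted form shown above, which is what ends up in the file when it is passed to spit.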

To simply write a file where each line corresponds to a string element
in the sequence, you can either build a new string consisting of
the strings of the seq, each with a newline character appended to the
end, concatenated together, and spit that, or you can use something
else that doesn't require you to build this monolithic string. Since
you used line-seq rather than slurp to read in the file, I will
instead demonstrate an approach other than spit:

(require '[clojure.java.io :as io])

(with-open [in (io/reader input-filename)
            out (io/writer output-filename)]
  (binding [*out* out]
    (->> in (line-seq) (map change-line) (map println) (dorun))))

This consumes the sequence line by line and writes the lines to the
file. This solution only needs to have one line in memory at a time.
The spit approach would require one big string to be constructed, and
may not be well suited to big files. The code would output a file
like this:

line1 added stuff
line2 added stuff

So in general, use slurp together with spit or read-line (or its
line-seq variant) together with println.
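
The line-by-line pairing can also be written with doseq and a direct write to the writer instead of rebinding *out*. This is a minimal sketch, not a definitive implementation: transform-file! is a name made up for this example, and change-line is redefined so the snippet stands on its own.

```clojure
(require '[clojure.java.io :as io])

(defn change-line [line]
  (str line " added stuff"))

(defn transform-file!
  "Hypothetical helper: copies in-path to out-path, applying f to each line.
  Only one line needs to be held in memory at a time."
  [f in-path out-path]
  (with-open [in (io/reader in-path)
              ^java.io.Writer out (io/writer out-path)]
    ;; doseq walks the lazy line-seq eagerly, for side effects only.
    (doseq [line (line-seq in)]
      (.write out (str (f line) "\n")))))
```

A call such as (transform-file! change-line "input.txt" "output.txt") would then do the whole job. Writing "\n" explicitly keeps the line ending independent of the platform, which println does not.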

// raek

Armando Blancas

Aug 16, 2011, 12:42:29 PM
to Clojure
You can put the line break back into each line (" added stuff\n") and
then do:
(spit "output.txt" (reduce str (map change-line old-data)))