Reading from a file, or, Haskell hates me


William Tracy

Nov 7, 2012, 7:54:07 PM
to baha...@googlegroups.com
Hello,

Here's a trivial Haskell program that should dump randomfile.txt to stdout:

import System.IO

readEntry :: Handle -> IO String
readEntry handle = do
  content <- hGetContents handle
  --putStrLn content
  return content

main = do
  text <- withFile "randomfile.txt" ReadMode readEntry
  putStrLn text


When I run this code, I get one blank line of output. Now, if I uncomment the putStrLn call inside of readEntry, I get the contents of randomfile.txt printed out *twice*.

As far as I can tell, Haskell is not evaluating "content <- hGetContents handle" until I call putStrLn on content. Meanwhile, withFile conveniently closes the input file before putStrLn gets called in main.

So, is there something I am doing here that is fundamentally wrong, or is lazy evaluation just plain stupid and buggy and I should just give up on Haskell already? Seriously, this is ridiculous.


William Tracy
afish...@gmail.com
(408) 685-4819

Jung Ko

Nov 7, 2012, 8:05:35 PM
to baha...@googlegroups.com
This seems to work for me:

main :: IO ()
main = readFile "randomfile.txt" >>= putStrLn

Hope that helps,

Jung


Johan Tibell

Nov 7, 2012, 8:07:00 PM
to baha...@googlegroups.com
Hi William,


On Wed, Nov 7, 2012 at 4:54 PM, William Tracy <afish...@gmail.com> wrote:
>
> When I run this code, I get one blank line of output. Now, if I uncomment the putStrLn call inside of readEntry, I get the contents of randomfile.txt printed out *twice*.
>
> As far as I can tell, Haskell is not evaluating "content <- hGetContents handle" until I call putStrLn on content. Meanwhile, withFile conveniently closes the input file before putStrLn gets called in main.
>
> So, is there something I am doing here that is fundamentally wrong, or is lazy evaluation just plain stupid and buggy and I should just give up on Haskell already? Seriously, this is ridiculous.


getContents should come with a big warning. Right now the docs read:

"The getContents operation returns all user input as a single
string, which is read lazily as it is needed (same as hGetContents
stdin). "

You've correctly identified the problem: lazy I/O is a bad idea.
The base library should never have included those functions, but that
ship has unfortunately sailed, as they're part of the published H98
standard and lots of code already uses them.

You can do one of two things here, either read the whole file strictly
(using Data.ByteString.hGetContents or Data.ByteString.readFile) or
explicitly read it chunk-wise, using e.g. Data.ByteString.hGet.
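
A minimal sketch of both options, assuming the bytestring package that ships with GHC. The names dumpStrict and dumpChunked and the 4096-byte block size are made up for illustration; they are not from the original program.

```haskell
import qualified Data.ByteString as B
import System.IO

-- Option 1: strict read. The whole file is in memory (and the handle
-- is closed) by the time B.readFile returns.
dumpStrict :: FilePath -> IO ()
dumpStrict path = B.readFile path >>= B.putStr

-- Option 2: chunk-wise read. Copy the file to stdout in fixed-size
-- blocks, so memory use stays bounded by the block size.
dumpChunked :: FilePath -> IO ()
dumpChunked path = withFile path ReadMode loop
  where
    chunkSize = 4096 -- arbitrary block size chosen for this sketch
    loop h = do
      chunk <- B.hGet h chunkSize
      if B.null chunk
        then return () -- hGet returns an empty ByteString at end of file
        else B.putStr chunk >> loop h

main :: IO ()
main = dumpStrict "randomfile.txt"
```

Both variants are safe inside withFile because nothing is deferred: every byte has been read before the handle is closed.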

Our base I/O layer is really in need of a face-lift.

Cheers,
Johan

William Tracy

Nov 7, 2012, 9:08:53 PM
to baha...@googlegroups.com
Thanks, Johan.

I'm working on a batch program that reads in a bunch of files, reformats them, and writes them out to a different directory. I was trying to structure the program with one function that reads the files, and another that writes the files.

After kicking around some ideas, I've finally gone with one function that does all the reading and writing, and delegates the reformatting to another (pure) function. It's not quite the way I originally wanted to structure the code, but it lets me keep the benefits of lazy I/O (low memory footprint, etc.).
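
The structure described above could look roughly like the following sketch. The names reformat and convertFile, the upper-casing transformation, and the output file name are hypothetical stand-ins; the real code is in the project linked later in the thread.

```haskell
import Data.Char (toUpper)

-- Hypothetical stand-in for the real reformatting step: any pure
-- String -> String function fits here.
reformat :: String -> String
reformat = map toUpper

-- Reads one file lazily, reformats it, and writes the result. The
-- input handle stays open until writeFile has consumed the whole
-- string, so nothing is cut off.
convertFile :: FilePath -> FilePath -> IO ()
convertFile src dst = do
  text <- readFile src
  writeFile dst (reformat text)

main :: IO ()
main = convertFile "randomfile.txt" "randomfile.out.txt"
```

Because the read and the write happen in the same function, the lazy string is consumed while its handle is still open. Note that src and dst need to be different files: GHC locks a file that is open for reading, so writing back to the same path would fail.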

Thanks for letting me bounce this off of you guys. :-)

Johan Tibell

Nov 7, 2012, 9:14:39 PM
to baha...@googlegroups.com
Hi William,

On Wed, Nov 7, 2012 at 6:08 PM, William Tracy <afish...@gmail.com> wrote:
> After kicking around some ideas, I've finally gone with one function that
> does all the reading and writing, and delegates the reformatting to another
> (pure) function. It's not quite the way I originally wanted to structure the
> code, but it lets me keep the benefits of lazy I/O (low memory footprint,
> etc.).

For future reference, if you do want to use lazy I/O functions (like
readFile) but want to make sure all the content gets read (strictly, thus
requiring O(n) space), you can use:

import Control.Exception (evaluate)

f = do
  s <- readFile "foo"
  _ <- evaluate (length s) -- Forces the whole string to be read.
  return s -- use s

Although at that point you probably just want to use a strict readFile
instead (but this trick is still useful to understand if you insist on
lazy I/O).

-- Johan

Myles C. Maxfield

Nov 7, 2012, 9:15:13 PM
to baha...@googlegroups.com
Conduits might suit your needs, though there is a somewhat steep learning curve.

--Myles

Johan Tibell

Nov 7, 2012, 9:20:14 PM
to baha...@googlegroups.com
On Wed, Nov 7, 2012 at 6:15 PM, Myles C. Maxfield
<myles.m...@gmail.com> wrote:
> Conduits might suit your needs, though there is a somewhat steep learning
> curve.

As Myles points out, conduits are a (new/experimental) approach to I/O
in Haskell. The Haskell I/O APIs are currently evolving. At the
bottom* we have a familiar (although a bit crufty) stream-like
abstraction (Handle) that you're probably used to from imperative
languages. The lack of resource safety and poor composability of that
abstraction have led people (including myself) to experiment with
different, more high-level solutions to the I/O problem, such as
conduits and iteratees.

* There's an even lower level layer in e.g. System.Posix.IO that gives
you (non-portable) access to the raw system calls.

-- Johan

Shachaf Ben-Kiki

Nov 7, 2012, 9:24:51 PM
to baha...@googlegroups.com
On Wed, Nov 7, 2012 at 5:07 PM, Johan Tibell <johan....@gmail.com> wrote:
It's worth noting that your original program would've worked "as
expected" if you'd used openFile instead of withFile.

The way lazy I/O (hGetContents etc.) works is by putting a handle in a
special "semi-closed" state. When it reads the whole file, the handle
is closed automatically, but if you close the file manually before
then, the string is cut off. withFile closes the file automatically,
so the string winds up empty unless you look at it before that
happens. Such are the evils of unsafeInterleaveIO.
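
To make that concrete, here is a sketch of two ways to rearrange the original program so the string is not cut off. The function names readViaOpenFile and dumpInsideWithFile are invented for this example.

```haskell
import System.IO

-- Variant 1: openFile leaves the handle open, so the lazy string can
-- still be read whenever it is finally demanded. hGetContents puts the
-- handle in the semi-closed state and closes it once the whole file
-- has been consumed.
readViaOpenFile :: FilePath -> IO String
readViaOpenFile path = do
  handle <- openFile path ReadMode
  hGetContents handle

-- Variant 2: keep withFile, but consume the string inside the
-- callback, before withFile closes the handle.
dumpInsideWithFile :: FilePath -> IO ()
dumpInsideWithFile path =
  withFile path ReadMode $ \handle -> do
    content <- hGetContents handle
    putStrLn content

main :: IO ()
main = do
  text <- readViaOpenFile "randomfile.txt"
  putStrLn text
```

The trade-off with the first variant is that the handle is only released once the string has been read to the end (or garbage collected), which is the resource-safety concern raised earlier in the thread.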

Shachaf

William Tracy

Nov 11, 2012, 4:40:50 AM
to baha...@googlegroups.com
For anyone who's curious, the project I was banging my head against has reached a usable state. :-)

I'm building a little static blog generator: It reads a set of posts as simple text files, and outputs a set of HTML files (with auto-generated navigation!) that you can FTP to your web host. Nothing particularly cool, but it's a nice learning exercise.

A sample blog is live here: http://www.wtracy.com/blog/

The code is here: https://github.com/wtracy/hablog

All the footer and header HTML is hard-coded in. At some point I need to add support for templating. (Anyone who is bored is welcome to fork my code and give it a go!)

Anyway, I wanted to say thanks for the help, and that I haven't given up on Haskell yet. :-)