[erlang-questions] Slow line oriented IO

Hynek Vychodil

unread,

Feb 14, 2013, 9:46:34 AM2/14/13

to erlang-questions

Hello,
I know it was been already discussed here in list and it is also recurring
topic for at least five years. But anyway I have been bitten by it again and
also found pretty pathological case. I have 30MB text file and it has a few
near to 1MB lines there. (I can provide file with same line lengths if
somebody interested.) What I have been observed is that reading this file
using raw file:read_line/1 takes 51s! For comparison I have tried some
different approaches and what I got (line_read_test:read_std/1 is using
file:read_line/1):

1> timer:tc(line_read_test,read_std,["test.txt"]).
{51028105,2408}
2> timer:tc(line_read_test,read,["test.txt"]).
{226220,2408}
3> timer:tc(line_read_test,read_port,["test.txt"]).
{139388,2408}

$time perl -nE'$i++}{say $i' test.txt
2408

real 0m0.053s
user 0m0.044s
sys 0m0.008s

$ time wc -l test.txt
2408 test.txt

real 0m0.013s
user 0m0.004s
sys 0m0.008s

$ time ./a.out test.txt
2408

real 0m0.020s
user 0m0.012s
sys 0m0.008s

It means erlang should be at least 225 times faster (line_read_test:read/1
which has flow control). Erlang can be 350 times faster
(line_read_test:read_port/1 without flow control). Another high level
language (perl) is almost thousand times faster. Special C program is almost
four thousands times faster and old good glibc is two and half thousands
times faster. Come on guys it is not even fun when simple (and wrong) erlang
wrapper around standard module is more than two order of magnitude faster.
From mine experience when there is something two orders of magnitude slower
it tells me there is something damn wrong. I have looked into efile_drv.c and
ti is unfortunately far beyond mine C skill but if simple buffering and
binary:match/2 can outperform it 225 times there has to be something rotten
in this code.

I have also experimented with read_ahead option in file:open and changing to
less value makes thing worse.

Just to make grasp how bad it is, in same time I'm able sort 150 million
64bit values (1.2GB of data) three times (one CPU core same HW). It is not in
flow control, mine simple wrapper does flow control too. It can't make
current code less intrusive, if it consumes 100% CPU for 51s instead of 226ms
then it will definitely affect whole server. It is not in concurrent access,
mine code allows concurrent access too. Admitting there is something broken
is first step to fixing it. I hope I helped at least in this way.

With best regards
Hynek Vychodil

line_read_test.erl

line_read_test.c

Lukas Larsson

unread,

Feb 14, 2013, 10:25:30 AM2/14/13

to Hynek Vychodil, erlang-questions

Hello!

If you start erlang with an async thread (erl +A 1) you will see radically different behaviour. For me the time it takes to use read_line drops from 58s to 2s. When doing file i/o you should always have some async threads to help you do the work, this is why in R16B we changed the default from 0 to 10 async threads.

Lukas

_______________________________________________
erlang-questions mailing list
erlang-q...@erlang.org
http://erlang.org/mailman/listinfo/erlang-questions

Joe Armstrong

unread,

Feb 14, 2013, 4:40:41 PM2/14/13

to Hynek Vychodil, erlang-questions

I wrote a little program to test this

-module(test1).

-compile(export_all).

test1(N) ->

timer:tc(?MODULE, make, [N]).

test2() ->

timer:tc(?MODULE, read, []).

make(N) ->

Line = lists:seq(30,59) ++ [$\n],

file:write_file("big.tmp",

lists:duplicate(N,Line)).

read() ->

{ok,B} = file:read_file("big.tmp"),

L = split(B, [], []),

length(L).

split(<<10,B/binary>>, L1, L2) ->

split(B, [], [list_to_binary(lists:reverse(L1))|L2]);

split(<<H,T/binary>>, L1, L2) ->

split(T, [H|L1], L2);

split(<<>>, L1, L2) ->

lists:reverse([list_to_binary(lists:reverse(L1))|L2]).

The timing I got were that Erlang was 9 times slower than perl (or wc)

which is more or less what I expected. If I wanted to speed this up

I'd write a NIF to split the binary at the first newline character.

I actually always use file:read_file(F) for everything - since getting the

entire file in at one go always seems a good idea and I have small files

(compared to my RAM) - I'd use file:pread for files that are too big for memory and do random access read. Reading the entire file seems

a good idea for files less than 100MB since I have 4GB of memory.

The OS seems to do a better job of caching entire files than I could ever

do so I don't worry about re-reading them ...

I have no idea why you see a factor of 250 - is this a memory problem.

How much memory have you got? Does your program scale linearly with

the file size - or does something go suddenly wrong as you increase the

size of the file?

Cheers

/Joe

On Thu, Feb 14, 2013 at 3:46 PM, Hynek Vychodil <vychodi...@gmail.com> wrote:

Hynek Vychodil

unread,

Feb 14, 2013, 7:28:24 PM2/14/13

to Joe Armstrong, erlang-questions

I think it is misunderstanding. There is not problem with erlang io
for normal files as you tested. Problem is with little bit unusual
files with few long lines as I attached. It causes some pathological
behavior in prim_file:read_line/1. It perfectly works for example for
another 3GB file with 16 million lines which are all evenly long about
198B. It can process it in nice two minutes which is pretty (29MB/s
and 148klines/s) but for another hundred times smaller file with seven
thousand less lines it takes half of this time which is terribly wrong
(0.5MB/s and 46lines/s). So I'm not pointing that erlang is bad for
processing files generally but there is some nasty bug in
prim_file:read_line/1 function.

test.txt.bz2

Hynek Vychodil

unread,

Feb 14, 2013, 7:29:35 PM2/14/13

to Lukas Larsson, erlang-questions

Using async thread drops time down to 4.6s which is still 20 times
slower than mine simple wrapper.

Lukas Larsson

unread,

Feb 15, 2013, 10:42:48 AM2/15/13

to Hynek Vychodil, erlang-questions

aha, you needed to have very long lines in the file to provoke the behaviour. The file I was experimenting on had the same line length all the time.

I think I've found the culprit. Aparently the read_ahead option was ignored after the first buffer had been consumed. Try the patch below and see if it works for you:

From 0884480691e047801cef117b19151a7d34c9e438 Mon Sep 17 00:00:00 2001
From: Lukas Larsson <lu...@erlang-solutions.com>
Date: Fri, 15 Feb 2013 16:37:52 +0100
Subject: [PATCH] invoke_read_line now remembers read_ahead settings

---
erts/emulator/drivers/common/efile_drv.c |    2 ++
1 file changed, 2 insertions(+)

diff --git a/erts/emulator/drivers/common/efile_drv.c b/erts/emulator/drivers/common/efile_drv.c
index 2279fec..d81f116 100644
--- a/erts/emulator/drivers/common/efile_drv.c
+++ b/erts/emulator/drivers/common/efile_drv.c
@@ -1160,6 +1160,8 @@ static void invoke_read_line(void *data)
            /* Need more place */
            ErlDrvSizeT need = (d->c.read_line.read_size >= DEFAULT_LINEBUF_SIZE) ?
                d->c.read_line.read_size + DEFAULT_LINEBUF_SIZE : DEFAULT_LINEBUF_SIZE;
+         if (need < (d->c.read_line.binp)->orig_size)
+           need = (d->c.read_line.binp)->orig_size;
            ErlDrvBinary   *newbin = driver_alloc_binary(need);
            if (newbin == NULL) {
                d->result_ok = 0;
--
1.7.10.4

Lukas Larsson

unread,

Feb 15, 2013, 10:58:49 AM2/15/13

to Hynek Vychodil, erlang-questions

Oh, and here are the benchmarks for me now:

R15B03:

1> timer:tc(line_read_test,read_std,["test.txt"]).

{21307874,14448}

R16B with patch:

1> timer:tc(line_read_test,read_std,["test.txt"]).

{91885,14448}

It's not every day you find a two line change which increases speed by a factor of 232!

Lukas

Hynek Vychodil

unread,

Feb 15, 2013, 6:57:39 PM2/15/13

to Lukas Larsson, erlang-questions

It works like charm. I have patched R15B03 ad with 1MB read ahead it
is unbelievable fast

20> timer:tc(line_read_test,read_std,["test.txt"]).
{30732,2408}

It is even four times faster than port version

24> timer:tc(line_read_test,read_port,["test.txt"]).
{135676,2408}

It is also faster without async threads. Measurements above was with
+A 10. Without async threads:

2> timer:tc(line_read_test,read_std,["test.txt"]).
{19322,2408}

I was expected time around 100ms but current behavior is far beyond
expectation. I hope this patch will find way into upcoming R16. It
seems to speed up line oriented io for "normal" files two times also.

Thanks a lot
Hynek Vychodil

On Fri, Feb 15, 2013 at 4:58 PM, Lukas Larsson

Joe Armstrong

unread,

Feb 16, 2013, 3:11:39 PM2/16/13

to Lukas Larsson, erlang-questions

On Fri, Feb 15, 2013 at 4:58 PM, Lukas Larsson <lu...@erlang-solutions.com> wrote:

Oh, and here are the benchmarks for me now:

R15B03:

1> timer:tc(line_read_test,read_std,["test.txt"]).
{21307874,14448}

R16B with patch:

1> timer:tc(line_read_test,read_std,["test.txt"]).
{91885,14448}

It's not every day you find a two line change which increases speed by a factor of 232!

Lukas

Great - well patched

/Joe

Joe Armstrong

unread,

Feb 16, 2013, 3:14:36 PM2/16/13

to Hynek Vychodil, Lukas Larsson, erlang-questions

On Sat, Feb 16, 2013 at 12:57 AM, Hynek Vychodil <vychodi...@gmail.com> wrote:

It works like charm. I have patched R15B03 ad with 1MB read ahead it
is unbelievable fast

20> timer:tc(line_read_test,read_std,["test.txt"]).
{30732,2408}

It is even four times faster than port version

24> timer:tc(line_read_test,read_port,["test.txt"]).
{135676,2408}

It is also faster without async threads. Measurements above was with
+A 10. Without async threads:

2> timer:tc(line_read_test,read_std,["test.txt"]).
{19322,2408}

I was expected time around 100ms but current behavior is far beyond
expectation. I hope this patch will find way into upcoming R16. It
seems to speed up line oriented io for "normal" files two times also.

Thanks a lot
Hynek Vychodil

And thanks to you - if we don't know it's broken we can't fix it - so all observations of

weird behaviour are worthy of note.

Cheers

/Joe

Lukas Larsson

unread,

Feb 18, 2013, 8:25:47 AM2/18/13

to Hynek Vychodil, erlang-questions

Great! All tests over the weekend seem to have passed for the patch so it will be part of R16B.

Lukas

Reply all

Reply to author

Forward