Read a certain number of lines in readdlm ?

Jarvist Moore Frost

Aug 12, 2014, 10:40:24 AM
to julia...@googlegroups.com

I’m writing a Julia parser to read in a large output file from a Fortran program, which is essentially a series of concatenated matrices of differing dimensions. It would be really useful to be able to do something along the lines of readdlm(file,nlines=3) to pull in, e.g., the 3x3 matrix that you know follows.

Currently I’m resorting to things like:

    celltext=string(readline(f),readline(f),readline(f))
    cell=readdlm(IOBuffer(celltext))

And this really doesn’t feel like a very elegant method (not helped by the fact that neither readline nor readlines appears to accept a ‘number of lines’ argument).

Am I missing the Julia way to do things here? Or should I start writing @macros to expand to this level of nitty gritty?

Iain Dunning

Aug 12, 2014, 2:14:40 PM
to julia...@googlegroups.com
No need for macros!
It's an interesting feature request; maybe open a GitHub issue so people can discuss it.

I think your solution is not terrible. You could generalize it to

readcell(f, nlines) = readdlm(IOBuffer(string([readline(f) for i in 1:nlines])))

Then do something like

f = open("mydata","r")
cells = {}
while !eof(f)
  push!(cells, readcell(f, 3))
end
close(f)
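
For readers on current Julia (1.x), the same read-a-block-at-a-time loop might be sketched as follows. This is only an illustration under stated assumptions: `readdlm` now lives in the `DelimitedFiles` standard library, `readline` drops the newline unless `keep=true` is passed, the `{}` literal is spelled `Any[]`, and the temporary file with two 3x3 blocks is hypothetical test data. The sketch concatenates the lines with `join` rather than `string`.

```julia
using DelimitedFiles  # readdlm is in this standard library in Julia 1.x

# Read the next `nlines` lines of `f` and parse them as one delimited block.
# keep=true preserves each newline so the rows stay separated.
readcell(f::IO, nlines::Integer) =
    readdlm(IOBuffer(join(readline(f; keep=true) for _ in 1:nlines)))

# Hypothetical input file: two 3x3 blocks back to back.
path, io = mktemp()
write(io, "1 0 0\n0 1 0\n0 0 1\n2 0 0\n0 2 0\n0 0 2\n")
close(io)

cells = Any[]  # current spelling of the old `{}` literal
open(path, "r") do f
    while !eof(f)
        push!(cells, readcell(f, 3))
    end
end
```

The `open(...) do f ... end` form closes the file even if parsing throws, which the explicit `close(f)` would not.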

Jameson Nash

Aug 12, 2014, 2:33:01 PM
to julia...@googlegroups.com
As a slight optimization, you could note that string works by creating an IOBuffer and printing the arguments into it, and then converting the result to a string. Thus, you could skip the extra conversion to a string and back by making the IOBuffer directly. 

Jarvist Moore Frost

Aug 13, 2014, 7:08:19 AM
to julia...@googlegroups.com

Thank you both!
However, forming a string with string([readline(STDIN) for i in 1:2]) produces the printed form of the array, "Union(ASCIIString,Array{Char,1},UTF8String)[\"1\\n\",\"2\\n\"]", and the escaped whitespace then carries through into the eventual readdlm result (i.e. the fields aren’t properly interpreted).

So the working code I have is much nastier, with temporary objects and all kinds of kludges:

function readnlines(f,n)
    local lines=""
    local i=1
    for i=1:n
        lines=lines*readline(f)
    end
    return (lines)
end

readmatrix(f, nlines) = readdlm(IOBuffer(readnlines(f,nlines)))

I think a macro @readnlines(f,nlines) that expands to (readline(f))^nlines might be more elegant, but I don’t know whether a massive string*string*...*string expression is efficient to evaluate.

Certainly in general I think a readnlines function is useful.
So would line ranges in readdlm: it currently supports a ‘skipstart’ option (not documented?), and making this a full line-range argument would be nice.

https://github.com/JuliaLang/julia/blob/454344fcea17021cb6ca5687d0a9f41daedd7e9e/base/datafmt.jl#L252

- readdlm in the Julia source
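
As a small illustration of the `skipstart` option mentioned above, here is a sketch for current Julia (1.x), where `readdlm` is provided by the `DelimitedFiles` standard library and `skipstart` is a keyword argument. The input text, with one header line followed by a 2x2 block, is hypothetical. Note that this only skips leading lines; it does not limit how many lines are read afterwards.

```julia
using DelimitedFiles

# Hypothetical data: one header line, then a 2x2 numeric block.
io = IOBuffer("cell vectors\n1 2\n3 4\n")

# Skip the first line and parse everything that remains.
m = readdlm(io; skipstart=1)
```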

I also found this discussion from last year on julia-dev. After trying for a while to use the multidimensional / tuple format form of readdlm, I read the source and decided that they were probably talking about prospective changes, not realised ones! (-:

https://groups.google.com/d/msg/julia-dev/PpSy2NQmkG0/cl67UWJec4QJ

Jameson Nash

Aug 13, 2014, 10:59:32 AM
to julia...@googlegroups.com
Since `string` prints its arguments, it was missing a ... to work as intended:
string([readline(f) for i in 1:nlines]...)
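
The effect of the splat can be checked in isolation. A minimal sketch, written for current Julia (1.x) and using a hypothetical two-element array in place of the lines read from the file:

```julia
lines = ["1\n", "2\n"]  # stand-in for the lines returned by readline

# Without the splat, `string` receives one argument (the array) and prints
# its display form, brackets and escaped newlines included.
joined_wrong = string(lines)

# With the splat, each element is a separate argument, so the pieces are
# simply concatenated.
joined_right = string(lines...)
```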

Since you wanted an IOBuffer anyways, I would use the following approach, which avoids lots of garbage creation over repeatedly calling `string` (aka `*`):
function readnlines(f,n)
    lines = IOBuffer()
    for i = 1:n
        write(lines, readline(f))
    end
    return lines # or takebuf_string(lines) if you didn't want an IOBuffer
end
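
For anyone trying this on current Julia (1.x), the IOBuffer approach might look like the sketch below. Several details are assumptions about newer versions rather than part of the original suggestion: `readdlm` comes from the `DelimitedFiles` standard library, `readline` needs `keep=true` to retain the newline, `takebuf_string(lines)` is spelled `String(take!(lines))`, and the buffer is rewound with `seekstart` before it is handed to `readdlm`, because writing leaves the position at the end. The in-memory input is hypothetical.

```julia
using DelimitedFiles

# Copy the next n lines of `f` into a fresh IOBuffer, ready for reading.
function readnlines(f::IO, n::Integer)
    lines = IOBuffer()
    for _ in 1:n
        write(lines, readline(f; keep=true))  # keep the newline between rows
    end
    seekstart(lines)  # rewind: writes leave the position at the end
    return lines      # or String(take!(lines)) if a string is wanted
end

readmatrix(f::IO, nlines::Integer) = readdlm(readnlines(f, nlines))

# Hypothetical input: a 2x2 block followed directly by a 3x3 block.
f = IOBuffer("1 2\n3 4\n1 0 0\n0 1 0\n0 0 1\n")
a = readmatrix(f, 2)
b = readmatrix(f, 3)
```

Because each call consumes only the lines it is asked for, blocks of differing dimensions can be read one after another from the same stream.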
