Simultaneously reading two text files line by line


Heng Li

Nov 17, 2012, 10:04:37 PM
to mi...@dartlang.org
The dart:io tutorial shows how to read one text file line by line. Now I want to read two files at the same time. Here is a minimal Perl script showing what I want to achieve in Dart:

open(F1, $ARGV[0]) || die; # open file 1
open(F2, $ARGV[1]) || die; # open file 2
while (my $l1 = <F1>) { # read one line from file 1
    my $l2 = <F2>; # read one line from file 2
    print "[file 1] $l1"; # print the file 1 line
    print "[file 2] $l2" if $l2; # print the file 2 line unless at end-of-file
}
close(F1); close(F2); # close

It reads the two files in turn and prints their lines. How do I do the same in Dart with StringInputStream?

I apologize if the solution is obvious or someone has asked. I am new to Dart.

Thanks in advance.

Christopher Wright

Nov 19, 2012, 2:19:07 PM
to General Dart Discussion
Since Dart IO is asynchronous, this is troublesome. StringInputStream doesn't buffer and doesn't allow you to peek ahead, so I'd probably wrap it in something that can buffer at least one line of text. Then I could write a callback that each stream calls in onLine that will print the next available line from each file.

An annoying amount of code compared to Perl, which was designed for dealing with streams of text, but nothing conceptually difficult.
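
For readers on a current Dart SDK, the "buffer one line per stream" idea can be sketched with StreamIterator, which holds exactly one pending element per stream. This is only an illustration under assumptions: it uses today's dart:io and dart:async rather than the 2012 StringInputStream API, and the names openLines and interleaveTwo are hypothetical, not part of any library.

```dart
import 'dart:async';
import 'dart:convert';
import 'dart:io';

/// Hypothetical helper: opens [path] as a pull-based iterator over its lines.
StreamIterator<String> openLines(String path) => StreamIterator(File(path)
    .openRead()
    .transform(utf8.decoder)
    .transform(const LineSplitter()));

/// Mirrors the Perl loop: one line from file 1, then one from file 2,
/// stopping when file 1 is exhausted.
Stream<String> interleaveTwo(String path1, String path2) async* {
  final f1 = openLines(path1);
  final f2 = openLines(path2);
  var more2 = true; // becomes false once file 2 has hit end-of-file
  try {
    while (await f1.moveNext()) {
      yield '[file 1] ${f1.current}';
      if (more2) {
        more2 = await f2.moveNext();
        if (more2) yield '[file 2] ${f2.current}';
      }
    }
  } finally {
    // Cancelling releases the underlying file handles.
    await f1.cancel();
    await f2.cancel();
  }
}

Future<void> main(List<String> args) async {
  await interleaveTwo(args[0], args[1]).forEach(print);
}
```

Because the result is itself a stream, only one line per file is held in memory at a time, which matters for very large inputs.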


--
Consider asking HOWTO questions at Stack Overflow: http://stackoverflow.com/tags/dart
 
 

Ladislav Thon

Nov 19, 2012, 11:43:05 PM
to mi...@dartlang.org

There are methods for dealing with files synchronously, and for scripting purposes I would never hesitate to use them. Heck, I wouldn't even hesitate to load the whole file into memory first.

Async is hard, let's go shopping.

Well, it doesn't necessarily have to be hard, but I won't miss this opportunity to say that dart:io looks like it's specifically made to make it as hard as possible :-)

LT

Mads Ager

Nov 20, 2012, 2:21:18 AM
to General Dart Discussion
It does indeed not have to be hard at all.  :-)

import 'dart:io';

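// Print one line from s1, then hand the turn over to s2.
printAndSwap(s1, s2) {
  print(s1.readLine());
  if (!s2.closed) {
    // Stop listening on s1 and wait for the next line of s2 instead.
    s1.onLine = null;
    s2.onLine = () => printAndSwap(s2, s1);
  }
  // If s2 is already closed, s1 keeps its handler and drains on its own.
}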

main() {
  var f1 = new File('file.dart');
  var f2 = new File('file2.dart');
  var s1 = new StringInputStream(f1.openInputStream());
  var s2 = new StringInputStream(f2.openInputStream());
  s1.onLine = () => printAndSwap(s1, s2);
}

Cheers,     -- Mads


Heng Li

Dec 1, 2012, 10:41:39 AM
to mi...@dartlang.org
Sorry for replying after 10 days. The solution seems limited to me. What if I want to open n files, where n is the number of files on the command line? What if I need more complex operations than simply printing the two files?

Why not provide a "FILE*" equivalent in Dart in the first place? I can imagine the Dart way has advantages in some cases, but in most other cases the complexity of file reading is a show-stopper for me, and arguably for many Perl/Python/Ruby programmers for whom text processing is part of their daily work. Note that I usually work with files over 10 GB. Loading the file into RAM is not an option.

I will send a feature request. Thank you for all the responses.

Heng

Florian Loitsch

Dec 3, 2012, 6:43:28 AM
to General Dart Discussion, Anders Johnsen
Fwiw this should work nicely with the streaming API.

var files = ['file1', 'file2', 'file3'];
var streams = files.mappedBy((path) {
  return new File(path).openForRead().transform(new LineSplitter());
});
var mergedStream = new MergingStream.cyclic(streams);
mergedStream.subscribe(onData: print);

The LineSplitter has already been written and will definitely find its place in the IO library (although maybe under a different name).
The MergingStream does not yet exist, but we want to provide some useful Stream manipulation classes, and this one or a similar one probably makes sense.
--
Give a man a fire and he's warm for the whole day,
but set fire to him and he's warm for the rest of his life. - Terry Pratchett
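
Since MergingStream was only a proposal at the time of this post, here is a rough sketch of the same cyclic merge for any number of files, written against a current Dart SDK. The function name mergeCyclic is hypothetical, and dropping a file from the rotation once it is exhausted is an assumption about how a cyclic merge would behave.

```dart
import 'dart:async';
import 'dart:convert';
import 'dart:io';

/// Hypothetical helper: yields lines from [paths] in round-robin order,
/// dropping each file from the rotation once it is exhausted.
Stream<String> mergeCyclic(List<String> paths) async* {
  final readers = [
    for (final path in paths)
      StreamIterator(File(path)
          .openRead()
          .transform(utf8.decoder)
          .transform(const LineSplitter())),
  ];
  var active = List.of(readers);
  try {
    while (active.isNotEmpty) {
      final stillOpen = <StreamIterator<String>>[];
      for (final reader in active) {
        if (await reader.moveNext()) {
          yield reader.current;
          stillOpen.add(reader);
        }
      }
      active = stillOpen;
    }
  } finally {
    // Cancel every iterator, including ones that already finished.
    for (final reader in readers) {
      await reader.cancel();
    }
  }
}

Future<void> main(List<String> args) async {
  // One path per command-line argument.
  await mergeCyclic(args).forEach(print);
}
```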

Heng Li

Dec 3, 2012, 3:41:22 PM
to mi...@dartlang.org, Anders Johnsen
This is interesting, but what if I only want to read from the second file when the information read from file 1 meets certain requirements, and vice versa? Note that I am not making up a use case. In my field, I need fine-grained control over alternating file reads when merging genomic regions.

I am not sure what is wrong with C-like streams in the first place (i.e. we get a file handle and then we can fully control reading). It should be easy to wrap C-stream-like APIs to implement the dart:io APIs, but doing the reverse seems more difficult. I know we can achieve all C-stream functionality with dart:io if we think harder, but is the complication really necessary? The unfamiliar dart:io APIs will also alienate many programmers who do text processing.

Btw, I have sent a feature request at: <http://code.google.com/p/dart/issues/detail?id=7084>.

Heng
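
The conditional reading described in this post (advance one file or the other depending on what was just read) can be sketched on a current Dart SDK with pull-based StreamIterator objects. Everything here is an assumption for illustration: the names openLines and mergeSorted are hypothetical, and plain string comparison stands in for whatever comparison of genomic coordinates a real merge would need.

```dart
import 'dart:async';
import 'dart:convert';
import 'dart:io';

/// Hypothetical helper: opens [path] as a pull-based iterator over its lines.
StreamIterator<String> openLines(String path) => StreamIterator(File(path)
    .openRead()
    .transform(utf8.decoder)
    .transform(const LineSplitter()));

/// Merges two files whose lines are already sorted, reading the next line
/// only from the file whose current line was just consumed.
Stream<String> mergeSorted(String pathA, String pathB) async* {
  final a = openLines(pathA);
  final b = openLines(pathB);
  try {
    var hasA = await a.moveNext();
    var hasB = await b.moveNext();
    while (hasA || hasB) {
      // Decide which file to advance based on the two buffered lines.
      final takeA = hasA && (!hasB || a.current.compareTo(b.current) <= 0);
      if (takeA) {
        yield a.current;
        hasA = await a.moveNext();
      } else {
        yield b.current;
        hasB = await b.moveNext();
      }
    }
  } finally {
    await a.cancel();
    await b.cancel();
  }
}

Future<void> main(List<String> args) async {
  await mergeSorted(args[0], args[1]).forEach(print);
}
```

The caller decides when each file advances, which is close to the control a C file handle gives, while the reads themselves stay asynchronous.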

Ladislav Thon

Dec 3, 2012, 4:01:15 PM
to mi...@dartlang.org
Here's the thing: Dart IO is primarily asynchronous, which is great for scalable servers but obviously terrible for scripting purposes. You are rightfully complaining about async IO, because it really is a bad fit for you. But note that you can use synchronous file IO in Dart. Actually, you can rewrite your C code pretty straightforwardly using File.openSync and then the RandomAccessFile.*Sync methods. Something like this:

import 'dart:io';

main() {
  var fileNames = new Options().arguments;
  var openFiles = fileNames.map((fileName) => new File(fileName).openSync(FileMode.READ));
  var buf = new List(65536);
  outerLoop: while (true) {
    for (var i = 0; i < openFiles.length; i++) {
      var openFile = openFiles[i];
      // readListSync returns the number of bytes read; 0 means end of file.
      var read = openFile.readListSync(buf, 0, buf.length);
      if (read == 0) break outerLoop;
      // Only the first `read` entries of buf hold fresh data.
      print("[$i] $buf");
    }
  }
}

If I understand correctly, this should be equivalent to the C code you posted in the feature request. I didn't try to run it, but it should give you a hint :-)

LT

Ladislav Thon

Dec 3, 2012, 4:06:04 PM
to mi...@dartlang.org
Closing the open files is left as an exercise for the careful reader, as well as error handling etc. :-)

LT

Heng Li

Dec 3, 2012, 7:39:45 PM
to mi...@dartlang.org
Thanks for the example. I did not think of openSync() because I saw it returns "RandomAccessFile" while a stream cannot be randomly accessed. Anyway, on ordinary files, I can use readListSync() to implement a C-like fgets() to read a line. "File" only provides methods to read the entire file as lines. It really should have a method to read a single line.

However, one problem still remains. 'new File("/dev/stdin").openSync()' does not work. I know dart:io has 'stdin', but its type is 'InputStream', which is asynchronous. Is this intended or a bug?

Thanks,

Heng
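
A C-like fgets() on top of the synchronous file API, as described in this post, might look roughly like the sketch below on a current Dart SDK. The class SyncLineReader and its members are hypothetical names; it uses RandomAccessFile.readIntoSync, the present-day counterpart of the readListSync call mentioned in this thread, and it assumes UTF-8 input with "\n" or "\r\n" line endings.

```dart
import 'dart:convert';
import 'dart:io';
import 'dart:typed_data';

/// Hypothetical fgets()-style line reader built on synchronous file reads.
class SyncLineReader {
  SyncLineReader(String path, {int bufferSize = 65536})
      : _file = File(path).openSync(),
        _buf = Uint8List(bufferSize);

  final RandomAccessFile _file;
  final Uint8List _buf;
  int _pos = 0; // index of the next unread byte in _buf
  int _len = 0; // number of valid bytes currently in _buf

  /// Returns the next line without its terminator, or null at end of file.
  String? readLine() {
    final line = BytesBuilder(copy: false);
    while (true) {
      if (_pos == _len) {
        // Buffer is used up: refill it from the file.
        _len = _file.readIntoSync(_buf);
        _pos = 0;
        if (_len == 0) {
          // End of file: return a final unterminated line if there is one.
          return line.isEmpty ? null : _decode(line.takeBytes());
        }
      }
      var end = _pos;
      while (end < _len && _buf[end] != 0x0A) {
        end++;
      }
      line.add(_buf.sublist(_pos, end)); // sublist copies, so reuse is safe
      final foundNewline = end < _len;
      _pos = foundNewline ? end + 1 : end;
      if (foundNewline) return _decode(line.takeBytes());
    }
  }

  void close() => _file.closeSync();

  /// Decodes one line as UTF-8 after dropping a trailing carriage return.
  static String _decode(Uint8List bytes) {
    final end = bytes.isNotEmpty && bytes.last == 0x0D
        ? bytes.length - 1
        : bytes.length;
    return utf8.decode(bytes.sublist(0, end));
  }
}

void main(List<String> args) {
  final readers = [for (final path in args) SyncLineReader(path)];
  var anyLine = true;
  while (anyLine) {
    anyLine = false;
    for (var i = 0; i < readers.length; i++) {
      final line = readers[i].readLine();
      if (line != null) {
        print('[file ${i + 1}] $line');
        anyLine = true;
      }
    }
  }
  for (final reader in readers) {
    reader.close();
  }
}
```

Only one buffer per file is kept in memory, so file size is not a constraint. For standard input, current SDKs also provide a synchronous stdin.readLineSync().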