Reading in output of NSTask with a lot of data


Colin Wheeler

Nov 2, 2009, 11:18:50 AM
to CocoaHeads DM
Hey Everybody,

I am encountering either a bug in NSTask or something odd. Basically, I have many NSTasks in my app, and up until now I have launched them and read their output just fine. However, I am now starting to launch git NSTasks, which are capable of outputting a large amount of data. It appears that if you configure an NSPipe object as the standard output of the NSTask and then launch it, the app hangs when it gets to [task waitUntilExit]; and the subprocess just sits there hanging. If I take a sample in Activity Monitor, all the processes that exhibit this behavior are hanging in their respective write calls.

I've attached a sample project that shows this behavior in action. All it does is use /bin/cat to spit out the contents of a simple text file with garbage sample data, then read in the result and insert it as text in an NSTextView. I've noticed that if you force quit the subprocess, the test app immediately carries on and shows you the string that it has (the same thing happens in my app), though it's very clear it didn't reach the end of the data.

Does anybody have any suggestions on how to work around this or get the output in chunks of data that I can eventually use to get all the output?

Colin Wheeler
"No fair! You changed the outcome by measuring it!" - Professor Farnsworth (Futurama)
TaskTester.zip

Jim Turner

Nov 2, 2009, 11:59:34 AM
to des-moines...@googlegroups.com
Just because one defines NSTask's standard output to a pipe doesn't
mean the pipe is ready to read. Init'ing a pipe just provides a
mechanism to which one can chain another command. Until you get a file
handle from that pipe, it doesn't know what you want to do. In
effect, your code is doing this:

/bin/cat test.txt |

Run that in the command line and you'll get the same effect as your
program. The OS is waiting for something to read that data.

So you need to give it something: [outPipe fileHandleForReading]

I've updated your program and reattached it. Note that on line #23,
we are asking for a file handle from the pipe. Even though we've not
started the NSTask, the pipe is ready to read data from it. You no
longer need to waitUntilExit as NSFileHandle's readDataToEndOfFile
will do that for you.
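
For reference, here is a minimal sketch of that read-then-wait order. It is not the code from the attachment: CaptureOutput is a hypothetical helper name, and the sketch assumes ARC (or garbage collection) so there are no release calls. The usual explanation for the original hang is that a pipe has a finite kernel buffer: once it fills, the child blocks in write, and a parent that is sitting in waitUntilExit never reads, so neither side makes progress. Reading to end-of-file before waiting avoids that.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): runs a command and returns
// everything it wrote to standard output.
static NSData *CaptureOutput(NSString *launchPath, NSArray *arguments) {
    NSTask *task = [[NSTask alloc] init];
    NSPipe *outPipe = [NSPipe pipe];

    [task setLaunchPath:launchPath];
    [task setArguments:arguments];
    [task setStandardOutput:outPipe];
    [task launch];

    // Drain the pipe first. This returns only at end-of-file, that is, once
    // the write end of the pipe is closed (normally when the child exits).
    NSData *output = [[outPipe fileHandleForReading] readDataToEndOfFile];

    // Waiting is safe now: the child cannot be stuck writing to a full pipe.
    // It is still useful if you want to check [task terminationStatus].
    [task waitUntilExit];
    return output;
}
```

Calling CaptureOutput(@"/bin/cat", [NSArray arrayWithObject:@"test.txt"]) after setting the working directory would correspond to the test project; the helper above leaves the working directory alone to stay short.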

Note that with massive amounts of data (like hundreds of gigs), you'll
not want to use readDataToEndOfFile, as it'll block the thread you're
on until it's done. Use readToEndOfFileInBackgroundAndNotify instead.
Or, if the task will generate streamed data, look at
readInBackgroundAndNotify and keep reading the available data as it
arrives.
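
A rough sketch of the background variant, under some assumptions: CaptureOutputInBackground is a hypothetical name, the code assumes ARC and block support, and it spins the current run loop only so the example can return a value. In an app the main run loop is already running, so you would simply handle the notification when it arrives instead of looping.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: collects a command's output via
// readToEndOfFileInBackgroundAndNotify instead of a blocking read.
static NSData *CaptureOutputInBackground(NSString *launchPath, NSArray *arguments) {
    NSTask *task = [[NSTask alloc] init];
    NSPipe *outPipe = [NSPipe pipe];
    NSFileHandle *readHandle = [outPipe fileHandleForReading];

    [task setLaunchPath:launchPath];
    [task setArguments:arguments];
    [task setStandardOutput:outPipe];

    __block NSData *result = nil;
    __block BOOL done = NO;

    // The notification is delivered once the handle has reached end-of-file;
    // the collected bytes are in the userInfo dictionary.
    NSNotificationCenter *center = [NSNotificationCenter defaultCenter];
    id observer = [center addObserverForName:NSFileHandleReadToEndOfFileCompletionNotification
                                      object:readHandle
                                       queue:nil
                                  usingBlock:^(NSNotification *note) {
        result = [[note userInfo] objectForKey:NSFileHandleNotificationDataItem];
        done = YES;
    }];

    [readHandle readToEndOfFileInBackgroundAndNotify];
    [task launch];

    // Demonstration only: keep this thread's run loop turning until the
    // notification fires. A Cocoa app would not need this loop.
    while (!done) {
        [[NSRunLoop currentRunLoop] runMode:NSDefaultRunLoopMode
                                 beforeDate:[NSDate dateWithTimeIntervalSinceNow:0.1]];
    }

    [center removeObserver:observer];
    [task waitUntilExit];
    return result;
}
```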

Jim
TaskTester.zip

Colin Wheeler

Nov 2, 2009, 12:13:11 PM
to des-moines...@googlegroups.com
*Sigh* Of course, only just after posting to this list do I stumble onto a bit of documentation that solves my problem. It looks like if, instead of doing a single read for the available data, you do this:

NSTask *task = [[NSTask alloc] init];
NSPipe *outPipe = [NSPipe pipe];
NSData *inData = nil;
NSFileHandle *readHandle = [outPipe fileHandleForReading];

[task setLaunchPath:@"/bin/cat"];
NSString *dirPath = [[[NSBundle mainBundle] pathForResource:@"test" ofType:@"txt"] stringByDeletingLastPathComponent];
[task setCurrentDirectoryPath:dirPath];
[task setArguments:[NSArray arrayWithObject:@"test.txt"]];
[task setStandardOutput:outPipe];
[task launch];

//[task waitUntilExit];
//NSMutableString *result = [[NSMutableString alloc] init];

while ((inData = [readHandle availableData]) && [inData length]) {
    NSString *tempString = [[NSString alloc] initWithData:inData encoding:NSASCIIStringEncoding];
    [outputView insertText:tempString];
}

//NSString *result = [[NSString alloc] initWithData:[[outPipe fileHandleForReading] readDataToEndOfFile] encoding:NSASCIIStringEncoding];
//[outputView insertText:result];


it works just perfectly. Jim, to begin with, I was doing this on a background thread, similar to:

//NSOperationQueue *queue = //

[queue addOperationWithBlock:^{
    NSTask *task = //

    [task launch];

    NSString *output = //NSTasks output

    [[NSOperationQueue mainQueue] addOperationWithBlock:^{
        [textView insertText:output];
    }];
}];

Jim, on first impression, something about your code looks like it could be reading the data while the process is still writing it. That is why I was waiting for the process to finish: to make absolutely sure it had finished writing to stdout before I read the output in.

Colin Wheeler


Jim Turner

Nov 2, 2009, 12:25:42 PM
to des-moines...@googlegroups.com
> Something about your code, jim, though on first impression, looks like you
> could be doing a read of the data there even though the process may still be
> spitting out data to the end

Nope. readDataToEndOfFile assures that even if the data isn't all
there yet that it'll get it before returning. The docs say "...if a
communications channel, until an end-of-file indicator is returned."
which is how pipes signal they are done. readDataToEndOfFile just
keeps calling readDataOfLength: for you until EOF is encountered.

The while loop in your code has a flaw in that if the data is
segmented for whatever reason, you'll iterate through the loop again
for the next chunk of data... which will create a new NSString, which
updates the text view... you see the problem. Either append the data
to a mutable string or, in the case with data that isn't massive, just
readDataToEndOfFile :)
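
One way to act on the "append" suggestion, as a sketch only: collect the raw chunks in an NSMutableData and decode once at the end. CaptureOutputInChunks is a hypothetical name, ARC is assumed, and UTF-8 is an assumption too (the code earlier in the thread used NSASCIIStringEncoding). Decoding once also sidesteps a chunk boundary landing in the middle of a multi-byte character, which could happen with per-chunk decoding of UTF-8 output.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: reads a command's output chunk by chunk, appends the
// bytes to one buffer, and converts to a string only after end-of-file.
static NSString *CaptureOutputInChunks(NSString *launchPath, NSArray *arguments) {
    NSTask *task = [[NSTask alloc] init];
    NSPipe *outPipe = [NSPipe pipe];
    NSFileHandle *readHandle = [outPipe fileHandleForReading];
    NSMutableData *buffer = [NSMutableData data];

    [task setLaunchPath:launchPath];
    [task setArguments:arguments];
    [task setStandardOutput:outPipe];
    [task launch];

    // availableData blocks until some bytes arrive and returns an empty
    // NSData object at end-of-file, which ends the loop.
    NSData *chunk = nil;
    while ((chunk = [readHandle availableData]) && [chunk length] > 0) {
        [buffer appendData:chunk];
    }

    [task waitUntilExit];

    // Assumption: the output is UTF-8. Returns nil if the bytes are not
    // valid in that encoding.
    return [[NSString alloc] initWithData:buffer encoding:NSUTF8StringEncoding];
}
```

If the text view should update while the task is still running, the loop body is the place to hand each chunk off; the final string is then only needed for whatever wants the complete output.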

Jim

Colin Wheeler

Nov 2, 2009, 1:29:52 PM
to des-moines...@googlegroups.com
I just did that for this test example. What I ended up doing in my own library was just that: calling readDataToEndOfFile. I guess I don't know why I put the [task waitUntilExit] in there, except that one day I thought reading without it could lead to getting incomplete data back :\

I don't know where you read that, but all I saw was this. I guess this is why I thought I had to wait for the task to finish:

readDataToEndOfFile

Returns the data available through the receiver up to the end of file or maximum number of bytes.

- (NSData *)readDataToEndOfFile

Return Value

The data available through the receiver up to UINT_MAX bytes (the maximum value for unsigned integers) or, if a communications channel, until an end-of-file indicator is returned.

Discussion

This method invokes readDataOfLength: as part of its implementation.

Availability
  • Available in Mac OS X v10.0 and later.
Declared In
NSFileHandle.h

It didn't say anything about the task actually completing before reading the data. Anyway thanks for the help :)

Colin Wheeler


Jim Turner

Nov 2, 2009, 2:33:17 PM
to des-moines...@googlegroups.com
> It didn't say anything about the task actually completing before reading the
> data.

Ahh, I see the confusion... You can be guaranteed the data is complete
because of its source. 'cat' outputs the entire file in one shot by
default. That's the old Unix admin in me just piecing the parts
together. NSPipe/NSFileHandle don't really have anything to do with
the buffering of the data in this case.

'cat -u' might have some different effects when used with
readDataToEndOfFile, but who uses -u with cat anyways?

Jim

Colin Wheeler

Nov 2, 2009, 3:01:56 PM
to des-moines...@googlegroups.com
Well, really I am using this with git commands, which cocoagit doesn't provide access to just yet. I was using cat here because it was quick to make an example with it. I've implemented this new code in Gitty and it's working fine now :)

Colin Wheeler

