textscan only reading one line

Kirsten

Sep 10, 2012, 4:17:11 PM
I'm using textscan to read about 100 files containing 20000+ rows each. I think I need textscan because a column that is otherwise numeric occasionally contains text ('MIC') indicating an instrument problem. My program basically works, but for about 10 of the files it reads one line and then stops. It seems to be a problem with the newline characters: if I copy in a row from one of the properly working files, it reads two rows and then stops. But the problem seems to be in every row, because if I delete a row from the file, it just reads the next row and stops. I'm using a MacBook. Can anyone recommend a way to view these invisible characters and edit them appropriately? My code is:
Input=textscan(fid, '%f %s %s %f %f %f %*s %*s %*s %*s %*f %*s %*s', 'delimiter', ',M','MultipleDelimsAsOne',1, 'headerlines', 49, 'TreatAsEmpty', 'IC');

Kirsten

Sep 10, 2012, 4:59:08 PM
I found TextWrangler, which lets me view spaces and newline characters, and I can't see any difference between the files that work and the files that only read one line. I also tried to resume reading the file using the 'position' output, but it tells me
??? Error using ==> textscan
First input can not be empty.

Any ideas?

dpb

Sep 10, 2012, 5:38:26 PM
On 9/10/2012 3:59 PM, Kirsten wrote:
> I found Text Wrangler which allows me to view spaces and newline
> character and I can't see any difference between files that work and
> files that only read one line. I also tried to resume reading the file
> using the 'position' but it tells me
> ??? Error using ==> textscan
> First input can not be empty.
...

I'd use the optional second return argument to see what was read and
precisely where the failure occurred. (BTW, TMW, it would surely be
_a_good_thing_ (tm) if textscan had more informative diagnostic
output on a failure, so one isn't flying totally blind in cases
like the above.)

Use the dump tool to then look specifically there and see if that does
tell you anything.
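
A dump of the raw bytes can also be taken from within MATLAB. The following is only a sketch: the temporary file, its contents, and the offset pos are made up for illustration, and in practice pos would be the position returned by textscan. Byte value 13 is a carriage return and 10 is a line feed, so a Windows-style line ending shows up as the pair 13 10.

```matlab
% Hypothetical stand-in for one of the data files, written with CRLF endings.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '1,74.4,74.5\r\n2,74.2,---\r\n');
fclose(fid);

pos    = 10;   % assumed byte offset; use the position textscan returned
nBytes = 12;   % how many bytes to inspect from that offset

fid = fopen(fname, 'r');
fseek(fid, pos, 'bof');               % jump to the offset, counted from start of file
raw = fread(fid, nBytes, 'uint8')';   % byte values as a row vector
fclose(fid);
delete(fname);

disp(raw);         % numeric view: 13 = CR, 10 = LF, 9 = tab
disp(char(raw));   % the same bytes shown as text
```

Comparing the numeric view for a working file and a failing file at the same relative spot shows whether the line endings or any other invisible bytes actually differ.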

I'm unable to tell from the original post--are these files both created
and read on the same platform or is there a case of moving them across?
No chance the ones that fail have something like that to distinguish them?

Which platform is/are being read on--Mac, I guess? There the 't' option
of fopen() shouldn't matter, but you can always try it for grins.

W/O the actual data files it's pretty much a guess from here as to
what's actually going on...

You might take a (short) snapshot of the screen of one that works
and one that fails and post it; maybe somebody else's eyes will be
better and spot a 'gotcha'.

Kirsten

Sep 10, 2012, 6:07:09 PM
I'm not sure what you mean by the dump tool. textscan will tell me the position where it stopped (1037), but I'm unable to get it to pick back up at that spot. (The documentation on this in the textscan help is pretty weak. Why is it a string??)

[Input, position]=textscan(fid, '%f %s %s %f %f %f %*s %*s %*s %*s %*f %*s %*s', 'delimiter', ',M','MultipleDelimsAsOne',1, 'headerlines', 49, 'TreatAsEmpty', 'IC');
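
When textscan reads from a file identifier, the second output is the byte offset in the file where scanning ended, the same value ftell(fid) reports afterwards, so the file pointer is already sitting at the stopping point. A minimal sketch with a made-up two-column file (the file name and contents are assumptions, not the data from this thread):

```matlab
% Hypothetical file whose third row cannot be parsed as numbers.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '1,2\n3,4\nx,5\n6,7\n');
fclose(fid);

fid = fopen(fname, 'r');
[C, position] = textscan(fid, '%f%f', 'Delimiter', ',');

% position is a file offset; it matches the current file pointer.
samePlace = (position == ftell(fid));

% Whatever textscan left unread can be inspected from here.
rest = fread(fid, Inf, 'uint8=>char')';
fclose(fid);
delete(fname);

disp(C{1}');       % first-column values read before the stop
disp(samePlace);
disp(rest);        % unread remainder of the file
```

Calling textscan(fid, ...) again, or fgetl(fid) first to skip a line, continues from that offset without passing position back in.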

Here's a chunk of the bad file, copied from just below the first 49 rows that are skipped (note that I'm skipping everything but the first 6 columns):
1,29 Jun 2012,09:17:41,74.4,74.5,74.3,---,---,---,---,---, ,
2,29 Jun 2012,09:17:42,74.2,74.6,73.8,---,---,---,---,---, ,
3,29 Jun 2012,09:17:43,73.6,73.8,73.4,---,---,---,---,---, ,
4,29 Jun 2012,09:17:44,74.1,74.4,73.8,---,---,---,---,---, ,
5,29 Jun 2012,09:17:45,74.2,74.4,74.0,---,---,---,---,---, ,
6,29 Jun 2012,09:17:46,74.1,74.1,74.0,---,---,---,---,---, ,
7,29 Jun 2012,09:17:47,74.2,74.5,74.0,---,---,---,---,---, ,
8,29 Jun 2012,09:17:48,74.4,74.5,74.3,---,---,---,---,---, ,
9,29 Jun 2012,09:17:49,74.5,74.5,74.4,---,---,---,---,---, ,
10,29 Jun 2012,09:17:50,74.5,74.6,74.5,---,---,---,---,---, ,
11,29 Jun 2012,09:17:51,74.7,74.9,74.6,---,---,---,---,---, ,
12,29 Jun 2012,09:17:52,75.0,75.4,74.6,---,---,---,---,---, ,

Here's a chunk of a good file:
1,29 Jun 2012,09:11:24,73.5,73.5,73.3,---,---,---,---,73.5, ,
2,29 Jun 2012,09:11:25,73.4,73.5,73.1,---,---,---,---,73.4, ,
3,29 Jun 2012,09:11:26,73.0,73.1,72.9,---,---,---,---,73.0, ,
4,29 Jun 2012,09:11:27,74.2,74.7,73.1,114.9,---,---,---,74.2, ,
5,29 Jun 2012,09:11:28,74.1,74.4,73.7,105.4,---,---,---,74.1, ,
6,29 Jun 2012,09:11:29,74.1,74.3,73.9,115.8,---,---,---,74.1, ,
7,29 Jun 2012,09:11:30,75.8,76.4,74.4,112.1,---,---,---,75.8, ,
8,29 Jun 2012,09:11:31,74.6,75.4,74.2,---,---,---,---,74.6, ,
9,29 Jun 2012,09:11:32,74.1,74.2,73.9,---,---,---,---,74.1, ,
10,29 Jun 2012,09:11:33,73.6,73.9,73.5,---,---,---,---,73.6, ,
11,29 Jun 2012,09:11:34,73.2,73.5,73.0,---,---,---,---,73.2, ,
12,29 Jun 2012,09:11:35,72.7,73.0,72.5,---,---,---,---,72.7, ,
13,29 Jun 2012,09:11:36,72.6,72.6,72.5,---,---,---,---,72.6, ,

There are values in that second to last column in the good file, but that is somewhat random and I skip that column anyway.

The files were all saved from a program on a Windows PC to a server location, and I access the server location from my Mac. So all the files are handled the same way and saved in the same location.

See anything?

TideMan

Sep 10, 2012, 7:11:40 PM
I don't see how your call to textscan could have possibly worked, but this does:

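% one numeric field, two text fields (date, time), then nine numeric fields
fmt = ['%f%s%s' repmat('%f',1,9)];
c = textscan(fid, fmt, ...
    'delimiter', ',', ...
    'TreatAsEmpty', '---');   % read '---' in numeric fields as NaN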
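
One possible explanation, which the thread itself doesn't spell out: in the failing file the eleventh field is always '---', while in the working file it is a number, and the original format reads that field with %*f. A numeric specifier stops textscan at text it cannot convert unless that text is listed in 'TreatAsEmpty'. The sketch below uses a shortened, made-up three-column sample read from a character vector rather than the real files:

```matlab
% Made-up sample: the third field of the first row is '---'.
str = sprintf('1,74.4,---\n2,74.2,73.5\n');

% Without TreatAsEmpty, scanning stops at the first '---'.
bad = textscan(str, '%f%f%f', 'Delimiter', ',');

% With TreatAsEmpty, '---' in a numeric field is read as NaN.
good = textscan(str, '%f%f%f', 'Delimiter', ',', 'TreatAsEmpty', '---');

nBad  = numel(bad{1});    % rows started before the failure
nGood = numel(good{1});   % rows read when '---' is treated as empty
disp([nBad nGood]);
disp(good{3});
```

If the count for the first call is smaller than for the second, the '---' placeholder rather than the newline characters is what halts the read.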