>> > [...]
>>
>> Hm, this is definitely not expected. Usually errors like this occur when
>> the data transfer between the workers and the MATLAB client is truncated
>> or corrupted in some way.
>>
> Hey Edric, thanks a lot for the reply. I tried my best to create
> self-contained code that reproduces this error, but failed, because
> it involves calling the external software COMSOL.
>
> Although COMSOL is involved, I still believe this error comes from
> the parallel side of MATLAB, because once I change parfor to a normal
> for loop, it runs without any errors for days.
>
> By the way, I am on my school's HPC cluster, which means that the
> workers may be spread over several nodes. Does that matter? After all,
> it works for hours before this error pops up.

Are you using an interactive parallel pool to do this, or is everything
running on the cluster inside e.g. a 'batch' job? If you are using an
interactive pool, it might be worth trying a 'batch' job instead as then
there will be no communication from your host to the remote cluster.
If you haven't used it before, the batch reference page is here:
<http://www.mathworks.com/help/distcomp/batch.html>
and you'll want to do something like
    c = parcluster(...); % get your HPC cluster
    j = batch(c, @myFunction, 2, {args}, 'Pool', 15); % 2 = number of outputs from myFunction
where 'myFunction' contains your PARFOR loops etc.
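
Once the job is submitted, something along these lines should let you wait for it and collect the results. This is only a rough sketch: it assumes 'myFunction' returns two outputs (matching the 2 above), 'args' is a placeholder for whatever inputs your function really takes, and parcluster() with no arguments picks up your default profile, which may not be your HPC one.

```matlab
% Sketch only: myFunction and args are placeholders, not real names.
c = parcluster();                                  % default profile; name your HPC profile here
j = batch(c, @myFunction, 2, {args}, 'Pool', 15);  % 15 pool workers plus 1 to run myFunction

wait(j);                       % block until the job has finished
if strcmp(j.State, 'finished') && isempty(j.Tasks(1).Error)
    out = fetchOutputs(j);     % 1-by-2 cell array holding the two outputs
else
    diary(j);                  % print the job's command-window output to help debug
end
delete(j);                     % remove the job's data from cluster storage
```

Note that 'Pool', 15 asks the scheduler for 16 workers in total, so make sure your allocation allows for that.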
Cheers,
Edric.