
Error with threads opening files


lance...@yahoo.com

Feb 27, 2010, 10:20:30 PM

Hi all,

I'm using C and pthread on a Linux machine, and I'm having trouble
parallelizing a program.

I'm basically trying to take in a folder of data files, divide them
into groups, each group handled by a thread, and run a function on
each of the data files.

The way I'm doing this is I have a global char **filename variable,
where filename[i] = filename of a data file. In the main function,
I'll read in the filenames of all the data files (minus "." and "..")
using scandir and put them in the filename variable. Then 4 (arbitrary
number) threads are created each calling the Process function. In
Process(), each thread only opens (using a FILE *fin declared in
Process()) and works on a portion of the data files using a
start_index and an end_index. For example, if there are 100 files,
then each thread will handle filename[0] to filename[24], filename[25]
to filename[49], filename[50] to filename[74] and filename[75] to
filename[99] respectively. After they're done, there is a pthread_join
in main() for all 4 threads.
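
Roughly, the structure looks like this (a stripped-down sketch rather
than my real code; the range struct, the "." directory and the error
handling here are just illustrative):

#include <dirent.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NUM_THREADS 4

char **filename;        /* global: filename[i] is the i-th data file */
int num_files;

struct range { int start_index, end_index; };

void *Process(void *arg)
{
    struct range *r = arg;
    FILE *fin;
    int i;

    for (i = r->start_index; i <= r->end_index; i++) {
        fin = fopen(filename[i], "rb");
        if (fin == NULL)
            continue;
        /* ... work on the data file ... */
        fclose(fin);
    }
    return NULL;
}

int skip_dots(const struct dirent *d)
{
    return strcmp(d->d_name, ".") != 0 && strcmp(d->d_name, "..") != 0;
}

int main(void)
{
    struct dirent **entries;
    pthread_t tid[NUM_THREADS];
    struct range r[NUM_THREADS];
    int i, chunk;

    num_files = scandir(".", &entries, skip_dots, alphasort);
    if (num_files <= 0)
        return 1;

    filename = malloc(num_files * sizeof *filename);
    for (i = 0; i < num_files; i++)
        filename[i] = entries[i]->d_name;   /* entries are never freed here */

    chunk = num_files / NUM_THREADS;
    for (i = 0; i < NUM_THREADS; i++) {
        r[i].start_index = i * chunk;
        r[i].end_index = (i == NUM_THREADS - 1) ? num_files - 1
                                                : (i + 1) * chunk - 1;
        pthread_create(&tid[i], NULL, Process, &r[i]);
    }
    for (i = 0; i < NUM_THREADS; i++)
        pthread_join(tid[i], NULL);

    return 0;
}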

I have checked that the filenames have been stored correctly, both in
main() and Process(). However, I keep getting segmentation fault here,
in Process():

for (i = start_index, i <= end_index ; i++)
fin = fopen(filename[i], "rb"); <--- Seg fault

I don't really get why there should be an error since none of the
threads are trying to open the same file.

Please advise.

Thank you.

Regards,
Rayne

Ian Collins

Feb 27, 2010, 10:31:13 PM

The real problem is probably in the code you haven't posted. Post a
(short if possible) complete example.

--
Ian Collins

Richard Heathfield

Feb 28, 2010, 3:41:04 AM

lance...@yahoo.com wrote:
> [...] each thread only opens (using a FILE *fin declared in
> Process()) and works on a portion of the data files using a
> start_index and an end_index. For example, if there are 100 files,
> then each thread will handle filename[0] to filename[24], filename[25]
> to filename[49], filename[50] to filename[74] and filename[75] to
> filename[99] respectively. After they're done, there is a pthread_join
> in main() for all 4 threads.
>
> I have checked that the filenames have been stored correctly, both in
> main() and Process(). However, I keep getting segmentation fault here,
> in Process():
>
> for (i = start_index, i <= end_index ; i++)
> fin = fopen(filename[i], "rb"); <--- Seg fault

Given the tiny amount of information available, my best guess can only
be a guess:

s/<=/<//
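
That is, if the indexes are computed something like this (pure
guesswork on my part), the <= walks one element past the last valid
entry:

  chunk       = num_files / num_threads;  /* 100 / 4 == 25 */
  start_index = thread_id * chunk;        /* 0, 25, 50, 75 */
  end_index   = start_index + chunk;      /* 25, 50, 75, 100 -- one too far */

  for (i = start_index; i <= end_index; i++)
      fin = fopen(filename[i], "rb");     /* eventually reads filename[100] */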

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within

Nick

Feb 28, 2010, 5:52:04 AM

Richard Heathfield <r...@see.sig.invalid> writes:

> lance...@yahoo.com wrote:
>> [...] each thread only opens (using a FILE *fin declared in
>> Process()) and works on a portion of the data files using a
>> start_index and an end_index. For example, if there are 100 files,
>> then each thread will handle filename[0] to filename[24], filename[25]
>> to filename[49], filename[50] to filename[74] and filename[75] to
>> filename[99] respectively. After they're done, there is a pthread_join
>> in main() for all 4 threads.
>>
>> I have checked that the filenames have been stored correctly, both in
>> main() and Process(). However, I keep getting segmentation fault here,
>> in Process():
>>
>> for (i = start_index, i <= end_index ; i++)
>> fin = fopen(filename[i], "rb"); <--- Seg fault
>
> Given the tiny amount of information available, my best guess can only
> be a guess:
>
> s/<=/<//

That looks like a good guess. I've seen segfaults from fopen if a single
FILE * has been fclosed twice (with no problem at the fclose itself).
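
Something along these lines, I mean (contrived, but it's the shape of
the bug):

  FILE *fp = fopen("data.bin", "rb");
  if (fp != NULL) {
      /* ... use the file ... */
      fclose(fp);
  }
  /* ... much later, by mistake ... */
  fclose(fp);   /* undefined behaviour; the crash can surface in a later fopen */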
--
Online waterways route planner | http://canalplan.eu
Plan trips, see photos, check facilities | http://canalplan.org.uk

Andre Ramaciotti

Feb 28, 2010, 8:16:54 AM

Richard Heathfield <r...@see.sig.invalid> writes:

> lance...@yahoo.com wrote:
>> [...] each thread only opens (using a FILE *fin declared in
>> Process()) and works on a portion of the data files using a
>> start_index and an end_index. For example, if there are 100 files,
>> then each thread will handle filename[0] to filename[24], filename[25]
>> to filename[49], filename[50] to filename[74] and filename[75] to
>> filename[99] respectively. After they're done, there is a pthread_join
>> in main() for all 4 threads.
>>
>> I have checked that the filenames have been stored correctly, both in
>> main() and Process(). However, I keep getting segmentation fault here,
>> in Process():
>>
>> for (i = start_index, i <= end_index ; i++)
>> fin = fopen(filename[i], "rb"); <--- Seg fault
>
> Given the tiny amount of information available, my best guess can only be a
> guess:
>
> s/<=/<//

I'll make another guess: s/,/;/ after start_index.

Ben Bacarisse

Feb 28, 2010, 8:48:31 AM

Andre Ramaciotti <andre.ra...@gmail.com> writes:

> Richard Heathfield <r...@see.sig.invalid> writes:
>
>> lance...@yahoo.com wrote:

<snip>


>>> for (i = start_index, i <= end_index ; i++)
>>> fin = fopen(filename[i], "rb"); <--- Seg fault
>>
>> Given the tiny amount of information available, my best guess can only be a
>> guess:
>>
>> s/<=/<//
>
> I'll make another guess: s/,/;/ after start_index.

Unlikely -- just because I don't recall ever seeing a compiler that
would accept the for loop as presented. It is more likely a typing
error. The other reason this seems likely is that the loop, as shown,
is pointless since it overwrites the same fin variable every time.
There is a lot of evidence that this is "something like" the code
being debugged.
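
If the real code is indeed something like that, the intended shape is
presumably closer to this (the fclose and the error check are my
assumptions, and end_index is taken to be the last valid index):

  for (i = start_index; i <= end_index; i++) {
      fin = fopen(filename[i], "rb");
      if (fin == NULL)
          continue;              /* or report the failure */
      /* ... process the file ... */
      fclose(fin);
  }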

To the OP: make sure you include program text using some automatic
method rather than typing it in. At the very least, I'd want to see
the code that sets up the filename data and the code that chooses the
start and end indexes. If it's not the fencepost error that Richard
Heathfield has suggested, it is most likely something to do with the
filename data, but people here need to see the real code.

--
Ben.

Jonathan de Boyne Pollard

Mar 1, 2010, 5:47:19 AM

>
>
> for (i = start_index, i <= end_index ; i++)
> fin = fopen(filename[i], "rb"); <--- Seg fault
>
This clearly isn't the actual code of your program, as others have
pointed out. And for diagnosing bugs we have tools known as debuggers.
Run your program under a debugger, and it will tell you what each thread
is actually doing at the time of the segmentation fault, and the values
of the various variables in each of the threads. If it then isn't
painfully obvious from the information provided by the debugger what the
problem is, as I suspect it will be, you can pass that information along
to other people.
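
With GCC and GDB, for example (other toolchains have their own
equivalents), that amounts to something like:

  $ gcc -g -pthread -o myprog myprog.c
  $ gdb ./myprog
  (gdb) run
  (gdb) info threads
  (gdb) thread apply all bt
  (gdb) print i
  (gdb) print filename[i]

Once the segmentation fault stops the program, "info threads" shows
which thread faulted, "thread apply all bt" gives a backtrace of every
thread, and "print" lets you inspect i and filename[i] in the faulting
frame (you may need "up" first if the fault is inside fopen itself).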

Kenny McCormack

Mar 1, 2010, 11:50:27 AM

In article <IU.D20100301.T1...@J.de.Boyne.Pollard.localhost>,
Jonathan de Boyne Pollard <J.deBoynePoll...@NTLWorld.COM>
accidentally flitted (unaware that he was off-topic):

Debuggers are completely off-topic (OT) in comp.lang.c (CLC).

Furthermore, nobody here uses them.

Richard

Mar 1, 2010, 12:23:16 PM

gaz...@shell.xmission.com (Kenny McCormack) writes:


Indeed. I have it on good authority from some c.l.c master that it's FAR
easier to write it correctly first than it is to debug it
later. Debuggers merely encourage a lax approach and poor adherence to
good development practices, apparently. Quite a surprise to me, who had
won various pay rises, bonuses etc. based on time to productivity and
closure rate on new code bases I was asked to fix or
enhance. Apparently, according to c.l.c regs, you need a time machine to
go back and rewrite the 3 million lines you've been tasked with
maintaining 12 years into its lifespan PROPERLY but, hey, what do I
know? Me? I found it trivial to locate certain through points and set a
HW breakpoint for values which caused core dumps, then analyse the
backtrace and find where the problems began. Apparently that's all
bullshit, and reading the code (all 5000 pages) while taking a dump would
have been FAR more productive.

Me? I think intelligent use of one makes you a far better analyst, since
you can fast-forward to trap conditions and border conditions which are
more likely to affect the overall data flow and design of the
program. In addition you can watch the flow of the program and soon get a
good idea of the most common execution flows.

Lots of others agree too, and the articles on using one are quite
persuasive. But then that would mean learning how to use one properly, as
opposed to littering code with printfs like some Basic-wielding pimply
teenager 20-odd years ago, as is suggested by the c.l.c elite.

--
"Avoid hyperbole at all costs, its the most destructive argument on
the planet" - Mark McIntyre in comp.lang.c

Nick Keighley

Mar 2, 2010, 8:17:06 AM

On 1 Mar, 17:23, Richard <rgrd...@gmail.com> wrote:
> gaze...@shell.xmission.com (Kenny McCormack) writes:
> > In article <IU.D20100301.T104733.P53605...@J.de.Boyne.Pollard.localhost>,
> > Jonathan de Boyne Pollard  <J.deBoynePollard-newsgro...@NTLWorld.COM>

<snip>

> >>This clearly isn't the actual code of your program, as others have
> >>pointed out.  And for diagnosing bugs we have tools known as debuggers.  

> >>[...]

Richard ??? has a rather biased view of these matters. Check clc's
archives for other points of view if you care.
