I'm updating a program using threads so that it can use checkpoints instead
of often running for a month or more without the ability to restart anywhere
except at the beginning.
It will later be recompiled for multiple platforms, but for now I am only using
the Cygwin version of gcc under 64-bit Windows. Without checkpoints, it will
already compile for all the intended platforms, using other versions of gcc.
It uses the Posix threads method of getting threads (pthreads). However, I've
never used threads before. Also, I've never used Linux, and have done little
under Unix.
I've already found that my K&R 2 book does not mention pthreads in the
index.
What book would you recommend for learning enough about threads and
pthreads that I can do a reasonable job with this update? I'd also consider an
online class, but not an on-campus class.
> What book would you recommend for learning enough about threads and
> pthreads that I can do a reasonable job with this update? I'd also
> consider an online class, but not an on-campus class.
You should specify what kind of app you're programming. There are plenty of books, many are specialized.
<lucas.lev...@u-pec.fr> wrote:
>On 28 October 2012, Robert Miles wrote:
>> What book would you recommend for learning enough about threads and
>> pthreads that I can do a reasonable job with this update? I'd also
>> consider an online class, but not an on-campus class.
>You should specify what kind of app you're programming. There are plenty
>of books, many are specialized.
I'd consider looking at "Advanced Programming in the UNIX Environment
(2nd Edition)".
<http://www.amazon.com/Advanced-Programming-UNIX-Environment-Edition/d...>
I only have the first edition, which does not cover threading, but the
second edition does.
-- (\__/) M.
(='.'=) If a man stands in a forest and no woman is around
(")_(") is he still wrong?
Robert Miles wrote:
> I'm updating a program using threads so that it can use checkpoints
> instead of often running for a month or more without the ability to
> restart anywhere except at the beginning.
> It will later be recompiled for multiple platforms, but for now I am only
> using the Cygwin version of gcc under 64-bit Windows. Without checkpoints,
> it will already compile for all the intended platforms, using other
> versions of gcc.
> It uses the Posix threads method of getting threads (pthreads). However,
> I've never used threads before. Also, I've never used Linux, and have done
> little under Unix.
> I've already found that my K&R 2 book does not mention pthreads in the
> index.
Yes. As far as I know, the last edition of K&R dates back to 1988, and the POSIX threads standard was only published in 1995.
> What book would you recommend for learning enough about threads and
> pthreads that I can do a reasonable job with this update?
I doubt it is the best book written on the subject, but I found O'Reilly's pthreads programming book a good read to get up to speed on pthreads.
On Monday, October 29, 2012 3:45:15 AM UTC-5, Lucas Levrel wrote:
> On 28 October 2012, Robert Miles wrote:
> > What book would you recommend for learning enough about threads and
> > pthreads that I can do a reasonable job with this update? I'd also
> > consider an online class, but not an on-campus class.
> You should specify what kind of app you're programming. There are plenty
> of books, many are specialized.
I'm updating an existing program rather than writing a new program.
It's related to RNA research and runs under BOINC. It needs to be able to
run on multiple types of platforms, including some with Windows and some with Linux.
It uses the easel and hmmer libraries as part of the source code.
On Monday, October 29, 2012 7:52:55 AM UTC-5, Rui Maciel wrote:
> Robert Miles wrote:
[snip]
> > What book would you recommend for learning enough about threads and
> > pthreads that I can do a reasonable job with this update?
> I doubt it is the best book written on the subject, but I found O'Reilly's
> pthreads programming book a good read to get up to speed on pthreads.
> On Monday, October 29, 2012 3:45:15 AM UTC-5, Lucas Levrel wrote:
>> On 28 October 2012, Robert Miles wrote:
>>> What book would you recommend for learning enough about threads and
>>> pthreads that I can do a reasonable job with this update? I'd also
>>> consider an online class, but not an on-campus class.
>> You should specify what kind of app you're programming. There are plenty
>> of books, many are specialized.
> I'm updating an existing program rather than writing a new program.
> It's related to RNA research and runs under BOINC.
Then, be aware that many books are more IT-oriented than science-oriented (in my opinion). If you need a primer on parallel programming, there are:
- The Art of Concurrency: introduces a classification of parallel algorithms, which may help you identify what the threads are used for;
- An Introduction to Parallel Programming: also deals with other forms of parallelism (MPI), but is science-oriented.
> It needs to be able to run on multiple types of platforms, including
> some with Windows and some with Linux.
If it already uses Pthreads and compiles on these platforms, where's the worry? Just stick with the Pthread implementation of the threads concept.
On Monday, October 29, 2012 11:35:14 AM UTC-5, Lucas Levrel wrote:
> On 29 October 2012, Robert Miles wrote:
> > On Monday, October 29, 2012 3:45:15 AM UTC-5, Lucas Levrel wrote:
> >> On 28 October 2012, Robert Miles wrote:
> >>> What book would you recommend for learning enough about threads and
> >>> pthreads that I can do a reasonable job with this update? I'd also
> >>> consider an online class, but not an on-campus class.
> >> You should specify what kind of app you're programming. There are plenty
> >> of books, many are specialized.
> > I'm updating an existing program rather than writing a new program.
> > It's related to RNA research and runs under BOINC.
> Then, be aware that many books are more IT-oriented than science-oriented
> (in my opinion). If you need a primer on parallel programming, there are:
> - The Art of Concurrency: introduces a classification of parallel
> algorithms, which may help you identifying what threads are used for;
> - An Introduction to Parallel Programming: also deals with other forms of
> parallelism (MPI), but is science-oriented.
> > It needs to be able to run on multiple types of platforms, including
> > some with Windows and some with Linux.
> If it already uses Pthreads and compiles on these platforms, where's the
> worry? Just stick with the Pthread implementation of the threads concept.
There's no desire to have it run only on platforms that can run for a month or more without any interruptions. Adding checkpoints so that it can resume with less computer time lost due to any interruptions REQUIRES understanding threads so that I can suspend all threads except the main thread when writing checkpoints or restoring from them. I don't yet know if it also requires telling all the other threads to place information specific to those threads where the main thread can find it and include it in the checkpoint.
BOINC projects generally have budgets low enough that they need to ask volunteers for computer time on some of the computers less reliable at running for a month or more without any interruption, such as computers running Windows and without any UPS to help them through power flickers. Therefore, it is a very useful feature to have checkpoints that can frequently save the work in progress to a checkpoint, and resume from that checkpoint later if there is some interruption such as a power outage for a few seconds longer than the computer can run without power.
I don't expect any changes to be needed for the main purpose of the program, only the very many places I need to change for adding a checkpoint feature.
> On Monday, October 29, 2012 11:35:14 AM UTC-5, Lucas Levrel wrote:
>> On 29 October 2012, Robert Miles wrote:
>>> It needs to be able to run on multiple types of platforms, including
>>> some with Windows and some with Linux.
>> If it already uses Pthreads and compiles on these platforms, where's the
>> worry? Just stick with the Pthread implementation of the threads concept.
> There's no desire to have it run only on platforms that can run for a
> month or more without any interruptions. Adding checkpoints so that it
> can resume with less computer time lost due to any interruptions
> REQUIRES understanding threads so that I can suspend all threads except
> the main thread when writing checkpoints or restoring from them.
Of course. I was responding to "it needs to be able to run on multiple types of platforms". Isn't it already?
> I don't expect any changes to be needed for the main purpose of the program, only the very many places I need to change for adding a checkpoint feature.
Can you reuse an existing software library for the checkpointing functionality you need?
Robert Miles wrote:
> There's no desire to have it run only on platforms that can run for a
> month or more without any interruptions. Adding checkpoints so that it
> can resume with less computer time lost due to any interruptions REQUIRES
> understanding threads so that I can suspend all threads except the main
> thread when writing checkpoints or restoring from them. I don't yet know
> if it also requires telling all the other threads to place information
> specific to those threads where the main thread can find it and include it
> in the checkpoint.
The pthreads standard essentially defines a set of primitive operations, which are then put together to pull off whatever concurrency tricks you wish to pull. This means that implementing the "checkpoint" feature you referred to isn't necessarily a pthreads issue; it is a software design one.
You didn't say what you mean by "checkpoint". Without any specific details, it is possible that a single mutex is all it takes to implement your "checkpoints" feature. Nevertheless, this depends entirely on what your requirements are, and on how you designed (or will design) your software.
Pthreads is orthogonal to this.
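To make the design point concrete, here is a minimal sketch, assuming the worker threads can call a function at points where their own data is consistent. It uses one mutex plus two condition variables (slightly more than the single mutex mentioned above) so that the main thread can park every worker, write its checkpoint, and then let them continue. Every name in it (`ckpt_gate`, `gate_safe_point`, `gate_pause_workers`, `gate_resume_workers`, `demo_worker`, `gate_demo`) is made up for illustration and comes neither from the program under discussion nor from any library; only the `pthread_*` calls are real. It also assumes no worker exits while the gate is in use; a worker that finishes early would have to be removed from `nworkers`, which is not shown.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical "checkpoint gate"; none of these names come from the program. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  all_parked;   /* signalled when the last worker parks */
    pthread_cond_t  resume;       /* broadcast when the checkpoint is done */
    int nworkers;                 /* worker threads that use the gate */
    int parked;                   /* workers currently waiting at the gate */
    int pause_requested;          /* nonzero while a checkpoint is wanted */
} ckpt_gate;

int gate_init(ckpt_gate *g, int nworkers)
{
    g->nworkers = nworkers;
    g->parked = 0;
    g->pause_requested = 0;
    if (pthread_mutex_init(&g->lock, NULL) != 0) return -1;
    if (pthread_cond_init(&g->all_parked, NULL) != 0) return -1;
    if (pthread_cond_init(&g->resume, NULL) != 0) return -1;
    return 0;
}

/* Workers call this wherever their own data is in a consistent state. */
void gate_safe_point(ckpt_gate *g)
{
    pthread_mutex_lock(&g->lock);
    if (g->pause_requested) {
        if (++g->parked == g->nworkers)
            pthread_cond_signal(&g->all_parked);
        while (g->pause_requested)
            pthread_cond_wait(&g->resume, &g->lock);
        g->parked--;
    }
    pthread_mutex_unlock(&g->lock);
}

/* Main thread: returns once every worker is waiting at the gate. */
void gate_pause_workers(ckpt_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->pause_requested = 1;
    while (g->parked < g->nworkers)
        pthread_cond_wait(&g->all_parked, &g->lock);
    pthread_mutex_unlock(&g->lock);
}

/* Main thread: lets the parked workers continue. */
void gate_resume_workers(ckpt_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->pause_requested = 0;
    pthread_cond_broadcast(&g->resume);
    pthread_mutex_unlock(&g->lock);
}

typedef struct { ckpt_gate *gate; const int *stop; long steps; } worker_arg;

void *demo_worker(void *p)
{
    worker_arg *w = p;
    while (!*w->stop) {
        w->steps++;                 /* stand-in for one unit of real work */
        gate_safe_point(w->gate);
    }
    return NULL;
}

/* Pauses two workers, checks that they stay still, then stops them. */
int gate_demo(void)
{
    enum { N = 2 };
    ckpt_gate g;
    pthread_t tid[N];
    worker_arg arg[N];
    long before[N];
    int stop = 0, i, rc;

    rc = gate_init(&g, N);
    assert(rc == 0);
    for (i = 0; i < N; i++) {
        arg[i].gate = &g; arg[i].stop = &stop; arg[i].steps = 0;
        rc = pthread_create(&tid[i], NULL, demo_worker, &arg[i]);
        assert(rc == 0);
    }
    gate_pause_workers(&g);
    for (i = 0; i < N; i++) before[i] = arg[i].steps;   /* "write checkpoint" */
    for (i = 0; i < N; i++) assert(arg[i].steps == before[i]);
    stop = 1;                       /* safe: no worker is running right now */
    gate_resume_workers(&g);
    for (i = 0; i < N; i++) pthread_join(tid[i], NULL);
    return 0;
}
```

The workers are never suspended from outside; they stop themselves at a point of their own choosing, which is what keeps the saved data consistent. Where those safe points belong is a property of the program, not of pthreads.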
On Thursday, November 1, 2012 7:31:58 AM UTC-5, Markus Elfring wrote:
> > I don't expect any changes to be needed for the main purpose of the program, only the very many places I need to change for adding a checkpoint feature.
> Can you reuse any software library for the needed functionality "checkpointing"?
> Regards,
> Markus
I doubt it. Checkpointing is writing the current state of the program to a disk file, so that the program can restart from that state later if some interruption makes it necessary. Deciding what to write is very program-dependent; I do not expect any software library to do a decent job at deciding this.
It is REQUIRED that no other threads are still changing the state of the program while this is done.
At least under Windows, the memory will probably be rearranged when the program restarts, so just writing everything back to the same addresses in memory will not work - some of its previous memory is now likely to already be in use by some other program.
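A minimal sketch of what this can look like, assuming the state that matters fits in a structure like the hypothetical `calc_state` below (the real program's state will be different and larger). The position in the work is stored as an index rather than a pointer, and the array contents are written out and read back into freshly allocated memory, so nothing depends on the old addresses. The names `calc_state`, `save_checkpoint`, `load_checkpoint` and `CKPT_MAGIC` and the file layout are all made up for illustration. The file is raw binary, so it is only meant to be read back by the same build on the same kind of machine. The sketch relies on POSIX `rename()`, which replaces an existing file (this includes Cygwin); the Microsoft C runtime's `rename()` fails if the destination exists, so a native Windows build would need a different final step.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical program state; the real program's state will differ. */
typedef struct {
    size_t  n;        /* number of entries in scores */
    size_t  next;     /* index of the next entry to process (not a pointer) */
    double *scores;   /* heap array; its address may differ after a restart */
} calc_state;

#define CKPT_MAGIC 0x434B5054u   /* arbitrary tag used to recognise the file */

/* Call only while no other thread is modifying *s. Returns 0 on success. */
int save_checkpoint(const char *path, const calc_state *s)
{
    char tmp[512];
    unsigned magic = CKPT_MAGIC;
    FILE *f;
    int len, ok;

    len = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (len < 0 || (size_t)len >= sizeof tmp) return -1;
    f = fopen(tmp, "wb");
    if (!f) return -1;
    ok = fwrite(&magic, sizeof magic, 1, f) == 1
      && fwrite(&s->n, sizeof s->n, 1, f) == 1
      && fwrite(&s->next, sizeof s->next, 1, f) == 1
      && fwrite(s->scores, sizeof *s->scores, s->n, f) == s->n;
    if (fclose(f) != 0) ok = 0;
    if (!ok) { remove(tmp); return -1; }
    /* Replace the old checkpoint only once the new one is complete. */
    return rename(tmp, path) == 0 ? 0 : -1;
}

/* Fills *s from the file, allocating a new array. Returns 0 on success. */
int load_checkpoint(const char *path, calc_state *s)
{
    unsigned magic = 0;
    FILE *f = fopen(path, "rb");

    if (!f) return -1;
    s->scores = NULL;
    if (fread(&magic, sizeof magic, 1, f) != 1 || magic != CKPT_MAGIC
        || fread(&s->n, sizeof s->n, 1, f) != 1
        || fread(&s->next, sizeof s->next, 1, f) != 1
        || s->next > s->n
        || (s->scores = calloc(s->n + 1, sizeof *s->scores)) == NULL
        || fread(s->scores, sizeof *s->scores, s->n, f) != s->n) {
        free(s->scores);
        s->scores = NULL;
        fclose(f);
        return -1;
    }
    fclose(f);
    return 0;
}
```

Writing to a temporary file first means that an interruption in the middle of a save leaves the previous checkpoint untouched.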
> On Monday, October 29, 2012 11:35:14 AM UTC-5, Lucas Levrel wrote:
>> If it already uses Pthreads and compiles on these platforms, where's the
>> worry? Just stick with the Pthread implementation of the threads concept.
Please trim your lines and tidy up the mess google makes of your replies!
> There's no desire to have it run only on platforms that can run for a month or more without any interruptions. Adding checkpoints so that it can resume with less computer time lost due to any interruptions REQUIRES understanding threads so that I can suspend all threads except the main thread when writing checkpoints or restoring from them. I don't yet know if it also requires telling all the other threads to place information specific to those threads where the main thread can find it and include it in the checkpoint..
Applications like BOINC typically dish out units of work to client machines and these in turn pass them on to worker threads. In this case, it makes sense for the unit of work to be scaled to be a) useful and b) small enough so not too much is wasted if the machine dies during processing.
Saving state partway through a work unit probably isn't worthwhile. Make your "checkpoints" the completion of a work unit.
On Friday, November 2, 2012 8:52:55 PM UTC-5, Ian Collins wrote:
> On 10/30/12 19:49, Robert Miles wrote:
> > On Monday, October 29, 2012 11:35:14 AM UTC-5, Lucas Levrel wrote:
> >> If it already uses Pthreads and compiles on these platforms, where's the
> >> worry? Just stick with the Pthread implementation of the threads concept.
> Please trim your lines and tidy up the mess google makes of your replies!
> > There's no desire to have it run only on platforms that can run for a
> > month or more without any interruptions. Adding checkpoints so that it
> > can resume with less computer time lost due to any interruptions REQUIRES
> > understanding threads so that I can suspend all threads except the main
> > thread when writing checkpoints or restoring from them. I don't yet know
> > if it also requires telling all the other threads to place information
> > specific to those threads where the main thread can find it and include it
> > in the checkpoint..
> Applications like BOINC typically dish out units of work to client
> machines and these in turn pass them on to worker threads. In this
> case, it makes sense for the unit of work to be scaled to be a) useful
> and b) small enough so not too much is wasted if the machine dies during
> processing.
> Saving state part way through a work unit probably isn't worth while.
> Make your "checkpoints" the completion of a work unit.
There's no good way to divide the process so that many of the workunits take
less than a month; some take even longer. If there were, checkpoints
would already be available.
There's also a much smaller type of workunit for the much shorter RNAs; these
already have checkpoints at the end of each section of RNA. However, this
doesn't work for the larger RNAs, such as most of those found in humans and
other mammals. The nature of the problem just doesn't allow doing much with
arbitrarily chopped pieces of the RNA.
> On Friday, November 2, 2012 8:52:55 PM UTC-5, Ian Collins wrote:
>> On 10/30/12 19:49, Robert Miles wrote:
>>> On Monday, October 29, 2012 11:35:14 AM UTC-5, Lucas Levrel wrote:
>>>> If it already uses Pthreads and compiles on these platforms, where's the
>>>> worry? Just stick with the Pthread implementation of the threads concept.
>> Please trim your lines and tidy up the mess google makes of your replies!
> Trying.
>>> There's no desire to have it run only on platforms that can run for a
>>> month or more without any interruptions. Adding checkpoints so that it
>>> can resume with less computer time lost due to any interruptions REQUIRES
>>> understanding threads so that I can suspend all threads except the main
>>> thread when writing checkpoints or restoring from them. I don't yet know
>>> if it also requires telling all the other threads to place information
>>> specific to those threads where the main thread can find it and include it
>>> in the checkpoint..
>> Applications like BOINC typically dish out units of work to client
>> machines and these in turn pass them on to worker threads. In this
>> case, it makes sense for the unit of work to be scaled to be a) useful
>> and b) small enough so not too much is wasted if the machine dies during
>> processing.
>> Saving state part way through a work unit probably isn't worth while.
>> Make your "checkpoints" the completion of a work unit.
> There's no good division of the process to allow the many of the workunits to
> be less than a month long, and sometimes longer. If there was, checkpoints
> would already be available.
I still see this as a problem for the work unit rather than a threading one. Surely it would be easier for the work unit to know when would be a good time to dump its state rather than rely on an application-wide mechanism? If the computations use multiple threads, each thread could save its own state.
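A minimal sketch of the "each thread saves its own state" idea, assuming each worker's progress fits in a small structure that it can copy out whenever it is consistent. Each worker publishes a snapshot into its own slot under a per-slot mutex, and whichever thread writes the checkpoint reads the slots without stopping anyone. The names (`progress`, `snapshot_slot`, `slot_publish`, `slot_read`, `sum_job`, `sum_worker`) are made up for illustration and are not from the program or any library. Snapshots from different threads are taken at slightly different moments, so this only fits if the threads' states do not depend on each other.

```c
#include <pthread.h>

/* Hypothetical per-thread progress record. */
typedef struct {
    long   items_done;
    double partial_sum;
} progress;

typedef struct {
    pthread_mutex_t lock;
    progress        saved;   /* last consistent snapshot from the owner */
} snapshot_slot;

int slot_init(snapshot_slot *s)
{
    s->saved.items_done = 0;
    s->saved.partial_sum = 0.0;
    return pthread_mutex_init(&s->lock, NULL);
}

/* Called by the owning worker whenever its private state is consistent. */
void slot_publish(snapshot_slot *s, const progress *current)
{
    pthread_mutex_lock(&s->lock);
    s->saved = *current;
    pthread_mutex_unlock(&s->lock);
}

/* Called by whichever thread writes the checkpoint. */
progress slot_read(snapshot_slot *s)
{
    progress copy;
    pthread_mutex_lock(&s->lock);
    copy = s->saved;
    pthread_mutex_unlock(&s->lock);
    return copy;
}

typedef struct { snapshot_slot *slot; long count; } sum_job;

/* Example worker: sums 1..count and publishes after every item. */
void *sum_worker(void *p)
{
    sum_job *job = p;
    progress mine = {0, 0.0};
    long i;

    for (i = 1; i <= job->count; i++) {
        mine.partial_sum += (double)i;   /* private state, no lock needed */
        mine.items_done = i;
        slot_publish(job->slot, &mine);  /* both fields updated together */
    }
    return NULL;
}
```

Because both fields are copied under the lock, a reader never sees `items_done` from one step paired with `partial_sum` from another. A real worker would probably publish far less often than once per item.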