Thanks for your response.
You said that you were able to compile the program, but that it won't run. Are you saying that it won't run at all, even with 24 or fewer cores?
I should point out that a successful run of this program means that nothing will print out. So, if you run the program and all you get is a return prompt a few seconds later, then it ran successfully. Both programs only print something if something goes wrong.
You asked why I've chosen not to use the team feature of OpenCoarrays. I've chosen not to use it because, the last time I checked, their implementation of teams was not fully functional. You're allowed to create teams, but you're limited in how much you can do with them. If I remember correctly, OpenCoarrays is not yet written so that you can use the square bracket notation to transfer information between teams. For instance, if you write a program that has a statement like "a[1, team_number=2] = n", the program won't compile. The Coarray Fortran compiler gives an error message because it doesn't recognize "team_number=2". It treats it as a syntax error.
Because of this, I've written the program so that I can apply the teams concept using only coarrays, with no actual teams.
I understand how the program may seem confusing. However, it mostly boils down to being able to envision how the teams concept might assign a team number to each image. In this program, I imagine it assigning values in a systematic way, like filling a two-dimensional matrix. You could imagine the images being assigned team numbers by filling a matrix that has 'N' columns. It fills this matrix from left to right, top to bottom, in the following way:
image1, image2, image3, image4, ..., imageN
imageN+1, imageN+2, imageN+3, imageN+4, ..., image2*N
image2*N+1, image2*N+2, image2*N+3, image2*N+4, ..., image3*N
...
Here, "N" is the number of teams. The imaginary matrix is filled this way until it runs out of images. Each column of this imagined matrix represents the images that belong to one team. The first column has all images in team 1, the second column has all images in team 2, and so on. So image1, imageN+1, and image2*N+1 would all be in team 1. image2, imageN+2, and image2*N+2 would all be in team 2, and so on.
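To make the imagined matrix concrete, here is a minimal sketch of the arithmetic behind it. It is not taken from my programs: the names n_teams and n_images and their values are made up for illustration, and it runs serially, looping over image numbers instead of calling this_image().

```fortran
program team_layout
  implicit none
  integer, parameter :: n_teams  = 4    ! hypothetical number of teams (N)
  integer, parameter :: n_images = 10   ! hypothetical total number of images
  integer :: image, team, row

  do image = 1, n_images
    ! Column of the imagined matrix: this is the team number.
    team = mod(image - 1, n_teams) + 1
    ! Row of the imagined matrix: position of the image within its team.
    row = (image - 1) / n_teams + 1
    print '(3(a,i0))', 'image ', image, ' -> team ', team, ', row ', row
  end do
end program team_layout
```

With N = 4, for example, image 5 lands in team 1, row 2, which is the "imageN+1" entry of the matrix above. In a real coarray program, each image would plug this_image() into the same two formulas.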
I understand not wanting to dig into the intricacies of someone else's program. I'm exactly the same way. Unless I wrote it, I usually don't like trying to debug it or follow it. In any case, hopefully the imagined matrix will help make it seem less daunting.
You're right that each image will work at its own rate. For this reason, the programs use the "syncimlist" function. That function waits until all of the images in its list reach the same line in the program before moving forward. This way, the program makes sure that all of the images it's going to work with have made all of their changes before it attempts to access any of those remote images.
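Here is a rough sketch of that synchronization pattern. It is not my actual code: it assumes "syncimlist" behaves like the standard "sync images" statement, and the names n_teams, val, col, and others are made up for illustration. Each image builds the list of images in its own column of the imagined matrix, synchronizes with them, and only then reads their data.

```fortran
program sync_team_sketch
  implicit none
  integer, parameter :: n_teams = 4       ! hypothetical number of teams (N)
  integer :: val[*]                       ! one integer on every image
  integer, allocatable :: col(:), others(:)
  integer :: me, i, total

  me = this_image()

  ! Every image in this image's column of the imagined matrix (its "team").
  col = [(i, i = mod(me - 1, n_teams) + 1, num_images(), n_teams)]
  ! The same list without this image itself.
  others = pack(col, col /= me)

  val = me               ! local change that teammates will read
  sync images(others)    ! wait until every teammate has reached this line

  total = val
  do i = 1, size(others)
    total = total + val[others(i)]   ! remote reads, safe after the sync
  end do

  sync images(others)    ! nobody moves on until all teammates have finished reading

  ! Same convention as my programs: print only if something went wrong.
  if (total /= sum(col)) print *, 'image ', me, ': unexpected total ', total
end program sync_team_sketch
```

With OpenCoarrays, something like "caf sync_team_sketch.f90 -o sketch" followed by "cafrun -n 24 ./sketch" should build and run it, and a clean run prints nothing.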
For me, the biggest piece of evidence that the issue is with MPI and the cluster is the fact that it works with no problem on a single machine. I can run either of these programs all day long on a single machine, using 24 images, and never have any issues.
Thanks, again, for taking the time to help. As always, I truly appreciate all of your input.