I/O and Barriers


Jeremy Bejarano

May 4, 2012, 10:45:50 PM
to mpi...@googlegroups.com
So, I think I might be missing something, or perhaps I'm doing this the wrong way. I thought that calling Barrier would make all processes wait until they have all reached the same point. However, I've noticed that it sometimes doesn't work the way I expect. I run the following program:

from mpi4py import MPI 
comm = MPI.COMM_WORLD 
print "1st wave"
comm.Barrier()
print "2nd wave"

and expect all of the first prints to appear before the program moves on. For the most part it works, but sometimes I get the following:

-bash-3.2$ mpiexec -n 4 python26 parallel.py
1st wave
1st wave
1st wave
2nd wave
1st wave
2nd wave
2nd wave
2nd wave

Is this actually correct? Is this an I/O problem with Python?

Thanks in advance for the help. :)

Jeremy

Aron Ahmadia

May 5, 2012, 8:10:19 AM
to mpi...@googlegroups.com
Interesting. I suspect that the print statement is not flushing output to stdout because stdout is buffered for performance reasons (see http://docs.python.org/using/cmdline.html#cmdoption-u).

Can you try:

import sys
from mpi4py import MPI 

comm = MPI.COMM_WORLD 
print "1st wave"
sys.stdout.flush()
comm.Barrier()
print "2nd wave"




Jeremy Bejarano

May 8, 2012, 12:12:31 PM
to mpi...@googlegroups.com
Yes, that worked. Using the unbuffered option seems to make it work more normally; the lines become fully interspersed with each other. Using sys.stdout.flush() works nicely.

Jeremy Bejarano

May 8, 2012, 1:14:54 PM
to mpi...@googlegroups.com
Actually, I'm still getting something strange.

So I use the following code:

import sys
from mpi4py import MPI 
comm = MPI.COMM_WORLD 
rank = comm.Get_rank()

if rank == 0:
    print "x = process rank"
    print "--------------------"
x = rank
sys.stdout.flush()
comm.Barrier()
# print process rank and the value of variable "x" on that process
print "process", rank, ": x =", x
sys.stdout.flush()
comm.Barrier()
if rank == 0:
    print "x = process rank * 2"
    print "--------------------"
x = rank * 2
sys.stdout.flush()
comm.Barrier()
print "process", rank, ": x =", x

And I still sometimes get this:

x = process rank
--------------------
process 0 : x = 0
x = process rank * 2
--------------------
process 1 : x = 1
process 1 : x = 2
process 2 : x = 2
process 2 : x = 4
process 3 : x = 3
process 3 : x = 6
process 0 : x = 0

If I use the unbuffered option -u:

$ mpiexec -n 4 python26 -u parallel.py

Then I get something like this:

x = process rank
--------------------
processprocess process 2 : x = 0 : x = 0
x = process rank * 2
--------------------
process 0 : x = 01 : x = 1
process 1 : x = 2
process 3 : x = 3
process 3 : x = 6
 2
process 2 : x = 4

Now, I know I could just do I/O from the root process, but what if I wanted to do parallel I/O on a file?

Lisandro Dalcin

May 8, 2012, 2:59:23 PM
to mpi...@googlegroups.com
You SHOULD NOT rely on any ordering for "standard" I/O streams or
files. This is not a Python or mpi4py issue; it applies to any
MPI-based program, even one written in C/C++/Fortran.

> Now, I know I could just do I/O from the root process, but what if I wanted
> to do parallel I/O on a file?
>

If you need to do parallel I/O, you can use either of the following two approaches:

1) Write: send data to process 0, then write from process 0. Read:
read in process 0, then send/broadcast to the others. This should work
fine for up to a few hundred or a few thousand processes (see the
sketch after this list).

2) Use MPI I/O. For that you need mpi4py's MPI.File objects; read
about their many features in the MPI standard, books, tutorials, etc.
This option is best suited if you have to collectively read/write
large amounts of data in binary format.
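
For instance, a minimal sketch of approach 1 (the filename 'out.txt'
and the per-rank strings are just placeholders):

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Write: every rank sends its piece to process 0, which does all file I/O.
piece = "line from rank %d\n" % rank
pieces = comm.gather(piece, root=0)  # list of strings on rank 0, None elsewhere
if rank == 0:
    with open('out.txt', 'w') as f:
        f.writelines(pieces)

# Read: process 0 reads the file, then broadcasts the contents to everyone.
text = None
if rank == 0:
    with open('out.txt') as f:
        text = f.read()
text = comm.bcast(text, root=0)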

There is actually an additional way, using a consistent
"serialized" approach; see this example:
http://code.google.com/p/mpi4py/source/browse/demo/sequential/test_seq.py

--
Lisandro Dalcin
---------------
CIMEC (INTEC/CONICET-UNL)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
3000 Santa Fe, Argentina
Tel: +54-342-4511594 (ext 1011)
Tel/Fax: +54-342-4511169

ashu...@gmail.com

Jul 13, 2017, 1:50:41 AM
to mpi4py
Apologies for necrobumping, but I have a related question. I have the test code below, where I do the following:

Write a test message to a file with rank 0 > MPI barrier > Read the test message > Assert equal > Repeat

It works for a few hundred or a few thousand iterations until it stops, when the message read back is a blank string, implying the barrier did not block until the file I/O completed. Even if I uncomment the flush and os.fsync calls, the code runs slower but still stops at some point.

from __future__ import print_function
import os
from mpi4py import MPI


comm = MPI.COMM_WORLD
rank = comm.Get_rank()
loop = True


def main():
    global loop
    txt_write = 'buhahaha'

    with open('test', 'w') as f1:
        if rank == 0:
            f1.write(txt_write)

        # f1.flush()
        # os.fsync(f1.fileno())

    comm.barrier()

    with open('test') as f2:
        txt_read = f2.read()

    try:
        assert txt_read == txt_write
    except:
        print("Assertion error ", txt_read, "!=", txt_write, 'rank=', rank)
        loop = False
    finally:
        comm.barrier()
        if rank == 0:
            os.remove('test')


if __name__ == '__main__':
    i = 0
    while loop:
        main()
        if i % 1000 == 0 and rank == 0:
            print("Iterations:", i)

        i += 1


On Tuesday, 8 May 2012 20:59:23 UTC+2, Lisandro Dalcin wrote:

You SHOULD NOT rely on any ordering for "standard" I/O streams or
files. This is not a Python or mpi4py issue; it applies to any
MPI-based program, even one written in C/C++/Fortran.


Am I to understand that it is nearly impossible for an MPI barrier to wait
until not only "standard" I/O streams but also file I/O streams have been flushed?

Will use of MPI file write fix this problem?

Lisandro Dalcin

Jul 13, 2017, 2:13:49 AM
to mpi4py
On 12 July 2017 at 18:18, <ashu...@gmail.com> wrote:

>
> def main():
>     global loop
>     txt_write = 'buhahaha'
>
>     with open('test', 'w') as f1:
>         if rank == 0:
>             f1.write(txt_write)
>
>         # f1.flush()
>         # os.fsync(f1.fileno())
>

This code smells bad: you should open/write/flush/sync in rank 0 only.
You are opening in all processes, then writing in rank 0, then flushing
and syncing in all. This is a race condition. Using files with MPI is
no different from using files with threads or multiple processes.

>
> On Tuesday, 8 May 2012 20:59:23 UTC+2, Lisandro Dalcin wrote:
>>
>>
>> You SHOULD NOT rely on any ordering for "standard" I/O streams or
>> files. This is not a Python or mpi4py issue; it applies to any
>> MPI-based program, even one written in C/C++/Fortran.
>>
>
> Am I to understand that it is nearly impossible for an MPI barrier to wait
> until not only "standard" I/O streams but also file I/O streams have been
> flushed?
>

I did not say that. However, let me clarify something: when you use
MPI_Barrier(), you are guaranteed that no process will leave the
barrier until all processes have entered it. Just writing to a file in
one of these processes is usually not enough, because of buffering. If
you do the flushing/syncing explicitly, then I would say yes, an MPI
barrier is equivalent to "waiting" for I/O in all processes, but you
have to do it the right way:

if comm.rank == 0:
    with open(..., 'w') as f:
        f.write(...)
        f.flush()
comm.Barrier()
# now all processes should be able to
# open the file and read same contents
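
A concrete, runnable version of that pattern (the filename
'shared.txt' and the message are just placeholders):

import os
from mpi4py import MPI

comm = MPI.COMM_WORLD

if comm.rank == 0:
    with open('shared.txt', 'w') as f:
        f.write('hello')
        f.flush()             # flush Python's userspace buffers
        os.fsync(f.fileno())  # and push the data down to the OS/storage
comm.Barrier()

# now all processes should be able to open the file and read the same contents
with open('shared.txt') as f:
    assert f.read() == 'hello'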

> Will use of MPI file write fix this problem?
>

If you need to perform multiple writes from multiple processes to a
single file in a coordinated manner, then yes, MPI.File is the "right"
way of doing it. But of course, you have to learn the semantics of MPI
I/O, and understand how the many, many different read/write methods
work.
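
For instance, a minimal collective-write sketch (the filename
'out.bin' and the record layout are just placeholders; NumPy is
assumed for the buffers):

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

record = np.full(4, rank, dtype='i4')          # 4 ints per rank
amode = MPI.MODE_WRONLY | MPI.MODE_CREATE
fh = MPI.File.Open(comm, 'out.bin', amode)     # collective open
fh.Write_at_all(rank * record.nbytes, record)  # collective write to disjoint offsets
fh.Close()                                     # collective close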


--
Lisandro Dalcin
============
Research Scientist
Computer, Electrical and Mathematical Sciences & Engineering (CEMSE)
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)
http://ecrc.kaust.edu.sa/

4700 King Abdullah University of Science and Technology
al-Khawarizmi Bldg (Bldg 1), Office # 0109
Thuwal 23955-6900, Kingdom of Saudi Arabia
http://www.kaust.edu.sa

Office Phone: +966 12 808-0459

Latham, Robert J.

Aug 29, 2017, 4:37:50 PM
to mpi...@googlegroups.com
On Thu, 2017-07-13 at 09:13 +0300, Lisandro Dalcin wrote:
> On 12 July 2017 at 18:18, <ashu...@gmail.com> wrote:
>
> >
> > def main():
> >     global loop
> >     txt_write = 'buhahaha'
> >
> >     with open('test', 'w') as f1:
> >         if rank == 0:
> >             f1.write(txt_write)
> >
> >         # f1.flush()
> >         # os.fsync(f1.fileno())
> >
>
> This code smells bad: you should open/write/flush/sync in rank 0 only.
> You are opening in all processes, then writing in rank 0, then flushing
> and syncing in all.

First off, if you are doing this to an NFS file, then good luck. NFS
does not provide strict-enough mechanisms to persist data.

There are well-defined rules in MPI for when I/O is visible. Sync is
not enough. Barrier is not enough. Sync/barrier/sync is required, for
exactly the reason Lisandro points out.
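
A minimal sketch of that sync/barrier/sync rule with mpi4py (the
filename 'data.bin' and the payload are just placeholders; NumPy is
assumed for the buffers):

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
amode = MPI.MODE_RDWR | MPI.MODE_CREATE
fh = MPI.File.Open(comm, 'data.bin', amode)

data = np.arange(4, dtype='i4')
if comm.rank == 0:
    fh.Write_at(0, data)  # one writer

fh.Sync()                 # sync 1: flush the writer's data (collective)
comm.Barrier()            # barrier: order the write before the reads
fh.Sync()                 # sync 2: invalidate any stale cached view

buf = np.empty(4, dtype='i4')
fh.Read_at(0, buf)        # every rank now sees the written data
fh.Close()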

You can also use MPI routines to close and reopen the file --
MPI_File_close and MPI_File_open are collective and impose enough
synchronization.

http://pythonhosted.org/mpi4py/usrman/tutorial.html#mpi-io

==rob