mpi4py and Python logging

Yury V. Zaytsev

Apr 18, 2013, 9:56:49 AM
to mpi...@googlegroups.com
Hi,

I am wondering what the best practices are for using mpi4py with the
Python logging module...

I have a piece of code that heavily uses logging and that I want to
parallelize using mpi4py. Ideally I'd like to see all processes writing
to the same log, but the messages should be prefixed with ranks.

Now, the question is, what happens if processes call the logging
functions simultaneously? Will the logs get all mixed up, like lines
broken in the middle by outputs from different ranks, etc.? If yes, is
there a standard solution to this problem?

Right now, I'm considering proxying my log object through a decorator,
which just discards the output for ranks other than zero, or opening
separate logs per MPI rank, but I don't like either of these solutions.

Anyone?

Thanks,

--
Sincerely yours,
Yury V. Zaytsev


Aron Ahmadia

Apr 18, 2013, 10:54:44 AM
to mpi...@googlegroups.com
> I have a piece of code that heavily uses logging and that I want to
> parallelize using mpi4py. Ideally I'd like to see all processes writing
> to the same log, but the messages should be prefixed with ranks.

You certainly could do this with a custom handler.

> Now, the question is, what happens if processes call the logging
> functions simultaneously? Will the logs get all mixed up, like lines
> broken in the middle by outputs from different ranks, etc.? If yes, is
> there a standard solution to this problem?

If you set the file access to line-buffered, you shouldn't see interlaced output.  Unfortunately, support for line-buffering is somewhat operating system specific, so you will probably have to test this out on whatever environments you are working in.
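A minimal sketch of the line-buffered setup, assuming a shared log file opened in append mode. The helper name `make_line_buffered_handler` and the file name are made up for illustration, and whether lines from different processes actually stay intact still depends on the OS and filesystem, so treat this as something to test rather than a guarantee:

```python
import logging
import os
import tempfile


def make_line_buffered_handler(path):
    """Hypothetical helper: a handler writing to `path` in line-buffered append mode."""
    # buffering=1 selects line buffering for text-mode files, so each
    # complete line is handed to the OS as soon as its newline is written.
    stream = open(path, "a", buffering=1)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    return handler


# Single-process demonstration; the path is a throwaway temporary file.
log_path = os.path.join(tempfile.mkdtemp(), "shared.log")
handler = make_line_buffered_handler(log_path)
logger = logging.getLogger("linebuf-demo")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger
logger.addHandler(handler)
logger.info("first")
logger.info("second")

with open(log_path) as f:
    lines = f.read().splitlines()
```

Note that `logging.StreamHandler` already flushes after every record, so the explicit buffering mostly matters for anything else writing to the same stream.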

The common solution on supercomputers, in case you're curious, is to create a separate file per rank when profiling/debugging :)  On a really big run, frequently only some small fraction of all ranks will actually write logs, with the majority disabled as you suggested.  
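For illustration, the file-per-rank pattern with only a subset of ranks enabled might look roughly like this. The helper `make_per_rank_logger`, the file naming scheme, and the choice of enabled ranks are all assumptions; in a real run the rank would come from `MPI.COMM_WORLD.Get_rank()` rather than a loop:

```python
import logging
import os
import tempfile


def make_per_rank_logger(rank, log_dir, enabled_ranks):
    """Hypothetical helper: log to '<log_dir>/rankNNNNN.log' for selected ranks only."""
    logger = logging.getLogger("per-rank-demo.%d" % rank)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    if rank in enabled_ranks:
        path = os.path.join(log_dir, "rank%05d.log" % rank)
        handler = logging.FileHandler(path)  # opens (creates) the file now
    else:
        handler = logging.NullHandler()  # disabled ranks open no file at all
    logger.addHandler(handler)
    return logger


# Simulate four ranks in one process; only ranks 0 and 2 write logs.
log_dir = tempfile.mkdtemp()
enabled = {0, 2}
for rank in range(4):
    make_per_rank_logger(rank, log_dir, enabled).debug("hello from %d", rank)

created = sorted(os.listdir(log_dir))
```

Disabled ranks get a `NullHandler`, so they never touch the filesystem.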

Cheers,
Aron

Yury V. Zaytsev

Apr 18, 2013, 11:56:24 AM
to mpi...@googlegroups.com
Hi Aron,

Thanks for your helpful comments!

On Thu, 2013-04-18 at 15:54 +0100, Aron Ahmadia wrote:
>
> You certainly could do this with a custom handler.

Yup, I'm already setting custom prefixes this way, so I will just
include the rank in there if the application is started in parallel
mode.
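A rough sketch of one way to get the rank into the prefix, using a `logging.Filter` attached to the handler. The names `RankFilter` and `make_rank_logger` and the format string are invented for this example and are not taken from the code discussed in the thread:

```python
import io
import logging


class RankFilter(logging.Filter):
    """Attach a fixed rank number to every record passing through."""

    def __init__(self, rank):
        super().__init__()
        self.rank = rank

    def filter(self, record):
        record.rank = self.rank  # made available to the formatter as %(rank)d
        return True


def make_rank_logger(rank, stream, name="app"):
    """Hypothetical helper: return a logger whose lines start with '[rank]'."""
    logger = logging.getLogger("%s.rank%d" % (name, rank))
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.addFilter(RankFilter(rank))
    handler.setFormatter(logging.Formatter("[%(rank)d] %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


# Under mpi4py the rank would come from MPI.COMM_WORLD.Get_rank();
# a literal is used here so the sketch runs without MPI.
buf = io.StringIO()
log = make_rank_logger(3, buf)
log.info("starting")
```

When the application runs serially, the same helper could simply be skipped or called with a format string that omits the rank.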

> If you set the file access to line-buffered, you shouldn't see
> interlaced output. Unfortunately, support for line-buffering is
> somewhat operating system specific, so you will probably have to test
> this out on whatever environments you are working in.

I see, this was the missing ingredient! I will check this out and
hopefully it will work both on my beowulfs and supercomputers I'm
currently using.

> The common solution on supercomputers, in case you're curious, is to
> create a separate file per rank when profiling/debugging :) On a
> really big run, frequently only some small fraction of all ranks will
> actually write logs, with the majority disabled as you suggested.

Sure, but I'm going to scale this code to millions of cores, and in this
situation just opening the file handles will take light years, so I'd
really like to have a solution for using the same log file for all
ranks.

If I switch from DEBUG to INFO or even WARN, the amount of logging is
going to be very moderate even considering the number of processes, but
still, it's critically important for me to be able to review the traces,
so I'd be unhappy about just skipping logging on most ranks.

I will see if I can post a minimal example to the list once I get it
figured out...

Z.

Lisandro Dalcin

Apr 19, 2013, 12:51:05 PM
to mpi4py
On 18 April 2013 18:56, Yury V. Zaytsev <yu...@shurup.com> wrote:
>
> I will see if I can post a minimal example to the list once I get it
> figured out...
>

Why not use an MPI-based approach? Can you try the code below and tell
us how it works at large core counts?

from mpi4py import MPI
from random import random
from time import sleep

comm = MPI.COMM_WORLD
# Open one file collectively on all ranks (append mode left disabled).
mode = MPI.MODE_WRONLY | MPI.MODE_CREATE  # | MPI.MODE_APPEND
fh = MPI.File.Open(comm, "logfile.log", mode)
fh.Set_atomicity(True)

for i in range(10):
    t = random() + 0.001
    msg = "[%d] iter %d - begin\n" % (comm.rank, i)
    # Write_shared needs a buffer; encode so this also works on Python 3.
    fh.Write_shared(msg.encode())
    sleep(t)
    msg = "[%d] iter %d - end\n" % (comm.rank, i)
    fh.Write_shared(msg.encode())

fh.Sync()
fh.Close()
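To connect this with the logging module, the `Write_shared` call could be wrapped in a custom handler. The following is only a sketch: `SharedFileHandler` and `FakeSharedFile` are hypothetical names, and the fake file object stands in for the `fh` returned by `MPI.File.Open` so that the example runs without MPI:

```python
import logging


class SharedFileHandler(logging.Handler):
    """Hypothetical handler that forwards each formatted record to an
    object exposing Write_shared(bytes), such as an mpi4py MPI.File."""

    def __init__(self, fh):
        super().__init__()
        self.fh = fh

    def emit(self, record):
        try:
            line = self.format(record) + "\n"
            # One call per record, so each line is a single shared-pointer write.
            self.fh.Write_shared(line.encode("utf-8"))
        except Exception:
            self.handleError(record)


class FakeSharedFile:
    """Stand-in for MPI.File so the sketch runs without MPI."""

    def __init__(self):
        self.chunks = []

    def Write_shared(self, data):
        self.chunks.append(data)


fake = FakeSharedFile()
handler = SharedFileHandler(fake)
handler.setFormatter(logging.Formatter("[%(rank)d] %(message)s"))
logger = logging.getLogger("shared-file-demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)
# The rank is passed per call here; under MPI it would be comm.rank.
logger.info("iter 0 - begin", extra={"rank": 1})
```

In a real run the handler would be constructed with the `fh` opened above, and `fh.Sync()` and `fh.Close()` would still have to be called on every rank before exit.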




--
Lisandro Dalcin
---------------
CIMEC (INTEC/CONICET-UNL)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
3000 Santa Fe, Argentina
Tel: +54-342-4511594 (ext 1011)
Tel/Fax: +54-342-4511169

chengd...@gmail.com

Feb 14, 2018, 4:31:24 PM
to mpi4py
Thank you very much for the code. Here is my code now:

chengd...@gmail.com

Feb 20, 2018, 8:06:18 PM
to mpi4py
I just found that the code does not work correctly on CentOS 6.7 with Intel MPI 2017. The lines written are intermixed in the file and some lines are lost.



On Friday, April 19, 2013 at 12:51:05 PM UTC-4, Lisandro Dalcin wrote:

Lisandro Dalcin

Feb 21, 2018, 3:04:08 AM
to mpi4py
On 21 February 2018 at 04:06, <chengd...@gmail.com> wrote:
> I just found that the code does not work correctly on CentOS 6.7 with
> Intel MPI 2017. The lines written are intermixed in the file and some
> lines are lost.
>

https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology/topic/731124

> see also:
> https://bitbucket.org/mpi4py/mpi4py/issues/91/write_shared-problem-in-linux-platform
>

As this does not seem to be mpi4py's fault, I closed the issue.


--
Lisandro Dalcin
============
Research Scientist
Computer, Electrical and Mathematical Sciences & Engineering (CEMSE)
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)
http://ecrc.kaust.edu.sa/

4700 King Abdullah University of Science and Technology
al-Khawarizmi Bldg (Bldg 1), Office # 0109
Thuwal 23955-6900, Kingdom of Saudi Arabia
http://www.kaust.edu.sa

Office Phone: +966 12 808-0459