Re: SrVO3 SIESTA mpiexec issue


Uthpala Herath

Jun 3, 2022, 9:20:38 PM
to Joshua Gray, DMFTwDFT
Hello Josh, 

I'm really sorry for not getting back to you sooner. I just moved to a new city for a postdoc position and the whole moving process has been hectic.
Luckily, I think I'm finally settled in enough to get back on track with the backlog I had put on the back burner.

Thanks for elaborating on your issue. I have not seen the error in ksum_error_XHF0 before. However, since the run writes information to ksum_output_XHF0, we can assume that the XHF0 calculation is okay, which suggests the problem is with dmft.x. What did you get in ksum_output_dmft and ksum_error_dmft? I think the answer to this issue lies there.

Thanks, 

Best,
Uthpala

On Mon, May 30, 2022 at 12:20 AM Joshua Gray <joshua....@gmail.com> wrote:
Hi Uthpala,

I just realised you may not have received my original reply to your email due to file size restrictions. Just in case you did not receive my reply, the original message is below. Instead of sending the entire DMFT folder generated when I run the script (as I tried to do earlier), I have attached an image of the files the script has generated in this folder up until the point of the error. Please let me know if you would like me to send any of those files through to you. If you did receive my original reply and are working on the issue I apologise for this email and will wait patiently for any information on my errors.



In the ksum_error_dmft.x file it says that there is an unknown option "-pmi_args". I have contacted the help desk for the supercomputer and they assure me -pmi_args is not an option for mpiexec. I have tried going through the scripts to find the line specifying the option but have been unsuccessful. A second error file written at the same time (ksum_error_XHF0) reports an error involving ml_discover_hierachy. Have you seen such errors, and do you have any ideas how to solve them? It seems like an issue with parallel processing, but I do not know enough to understand how to fix or sidestep this problem. I have attached the error files and the ksum_output_XHF0 file for you to have a look at if it helps. In the ksum_error_dmft.x file, lines 9-73 come purely from me killing the job after the code hung and are unrelated to the issue.

I have also included the general DMFT.error file which states an error converting a string to a float. I am unsure if this refers to INFO_KSUM so I have supplied that file as well. Initially there is a line of data in INFO_KSUM, but it must throw an error generating the next line.

I apologise for not being able to pinpoint the error further than this. I am new to a lot of this.

Please let me know if there is further information you require.

Best,
Josh

On Thu, May 19, 2022 at 12:35 PM Uthpala Herath <ukh...@mix.wvu.edu> wrote:
Dear Joshua, 

Can you please elaborate a bit more on the error so we can find the source of the problem?

Thank you, 

Best,
Uthpala

On Wed, May 18, 2022 at 9:19 PM Joshua Gray <joshua....@gmail.com> wrote:
Dear development team,

I posted a conversation into the DMFTwDFT group on April 11 in regards to an mpiexec issue I have been receiving, and I wanted to follow up on it in case it was missed. Unfortunately I am at the point where I cannot find the source of the error message I am receiving. This has caused my research using this method to grind to a halt for the moment. I realise that you are all very busy people, so any help you can provide on this issue is greatly appreciated.

Kind regards,
Joshua Gray



--
Uthpala Herath
Postdoctoral Associate
Department of Mechanical Engineering and Materials Science
Duke University
Durham, NC




Uthpala Herath

Jun 6, 2022, 9:50:10 PM
to Joshua Gray, DMFTwDFT
This is very strange. In RUNDMFT.py, dmft.x is executed with mpirun -np <NUMBER OF PROCS> dmft.x.
Can you go into the DMFT directory, start an interactive job, run mpirun -np <num cores> dmft.x by hand, and see what happens?
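
For reference, the manual test could be scripted roughly as follows. This is only a sketch and not code from RUNDMFT.py: the helper names (find_launchers, build_command, run_in_directory) are made up for illustration, and the only part taken from this thread is the command shape mpirun -np <number of procs> dmft.x. It first reports which mpirun and mpiexec the shell would pick up, which may help if the two resolve to different installations or to a site-specific wrapper.

```python
import shutil
import subprocess


def find_launchers(names=("mpirun", "mpiexec")):
    """Map each launcher name to the full path found on PATH (None if absent)."""
    return {name: shutil.which(name) for name in names}


def build_command(nprocs, executable="dmft.x", launcher="mpirun"):
    """Build the argument list for: mpirun -np <nprocs> dmft.x"""
    if nprocs < 1:
        raise ValueError("nprocs must be a positive integer")
    return [launcher, "-np", str(nprocs), executable]


def run_in_directory(command, workdir):
    """Run the command inside workdir and return (exit code, stdout, stderr)."""
    result = subprocess.run(command, cwd=workdir, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


# Report which launchers the current environment resolves to.
for name, path in find_launchers().items():
    print(f"{name}: {path or 'not found on PATH'}")
```

Inside an interactive job, something like run_in_directory(build_command(4), "DMFT") would then run the same command by hand and return the error text directly, where 4 and "DMFT" are placeholders for the core count and the directory on your system.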

- Uthpala

On Mon, Jun 6, 2022 at 7:20 PM Joshua Gray <joshua....@gmail.com> wrote:
Hi Uthpala,

That does sound hectic. I'm glad you're settling into your new position and city.

The ksum_output_dmft is completely empty, and the ksum_error_dmft file contains the original error message stating 'unknown option "-pmi_args"'. Everything below the line stating "Type 'mpiexec --help' for usage." is purely due to me terminating the job. I've spent a lot of time searching for where "-pmi_args" is defined but have yet to find it. I have attached ksum_error_dmft for you to look at (I cannot attach ksum_output_dmft because it is 0 bytes).
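
For anyone attempting the same search, a recursive scan for the literal option string can be sketched as below. This is a generic illustration and not part of DMFTwDFT: find_string is a hypothetical helper name, and the directories to scan (the DMFTwDFT installation, the job directory, any MPI wrapper scripts) are assumptions about where such an option might be set. Files are opened with decoding errors ignored, so compiled executables are scanned as well as scripts.

```python
import os


def find_string(root, needle):
    """Yield (path, line_number, line) for each line under root containing needle."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                # errors="ignore" lets binary files be scanned without crashing.
                with open(path, "r", errors="ignore") as handle:
                    for number, line in enumerate(handle, start=1):
                        if needle in line:
                            yield path, number, line.rstrip("\n")
            except OSError:
                continue  # unreadable file (permissions, broken link); skip it
```

Calling list(find_string(".", "-pmi_args")) from the top of each candidate directory lists every file and line where the string occurs. An empty result for a directory means the option is not written literally in any readable file there.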

Best,
Josh