demo.r does not run...

john alexander sanabria ordonez

Oct 7, 2014, 6:00:06 PM
to rbigdatap...@googlegroups.com
Hi,

I'm having problems running pbdR with OpenMPI. I am able to run the following command:

hpuser@master:~$ Rscript /shared/demo.r
COMM.RANK = 0
Hello world

Here, demo.r is:

library(pbdMPI, quiet = TRUE)
init()
comm.cat("Hello world\n")
finalize()


However, the program does not finish when I run demo.r as follows:

mpirun -np 2 --hostfile machinefile Rscript /shared/demo.r

When I check the processes on one of the cluster nodes, I see an R process that keeps running and never finishes:

 1337 hpuser    20   0  184m  35m 4712 R 99.9  3.6   1:08.56 R      

Thanks,

john alexander sanabria ordonez

Oct 8, 2014, 12:22:40 AM
to rbigdatap...@googlegroups.com
My initial post was not written clearly.

When I run "mpirun ...", the program does not finish. However, something seems to be happening on the cluster nodes, because each of them is running an R process that consumes 100% of the CPU.

How can I debug the Rscript execution to determine what is going wrong?

Thanks a lot,

Dale Wang

Oct 8, 2014, 4:13:22 AM
to rbigdatap...@googlegroups.com
Hi,
Try running the program on a single node with two MPI processes, using the following command:
  mpirun -np 2 Rscript /shared/demo.r
This runs the program with two MPI processes on the node where you issue the command. If it works, MPI is installed properly.
By the way, have you installed all the pbdR packages on every node in the cluster? pbdR must be installed on every node, not just on the master node where you compiled the packages.

Best wishes,
                                 Dale

On Wednesday, October 8, 2014 at 12:22:40 PM UTC+8, john alexander sanabria ordonez wrote:

john alexander sanabria ordonez

Oct 8, 2014, 7:26:26 PM
to rbigdatap...@googlegroups.com
Hi Dale,

I ran the command as you suggested and it worked:

hpuser@master:~$ mpirun -np 2 Rscript /shared/demo.r
COMM.RANK = 0
Hello world

A question: shouldn't that command print "Hello world" twice?

I installed pbdMPI on all nodes (the master and the worker nodes) as follows:

sudo R CMD INSTALL pbdMPI

Is that correct?

Dale Wang

Oct 8, 2014, 9:34:31 PM
to rbigdatap...@googlegroups.com
The install process is correct.
Use comm.print(..., all.rank=T) in demo.r to see the output from both processes.
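For reference, here is a minimal sketch of that change, written as a shell snippet that saves a variant of demo.r. The file name demo_all.r and the temporary directory are placeholders, not part of the original setup; comm.cat(), comm.rank(), comm.size() and the all.rank argument are pbdMPI's own. By default only rank 0 prints, which is why a two-process run shows a single "Hello world".

```shell
#!/bin/sh
# Write a variant of demo.r in which every rank prints its own line.
# demo_all.r is a hypothetical name; the temporary directory only keeps
# this sketch from overwriting an existing file.
out_dir=$(mktemp -d)
cat > "$out_dir/demo_all.r" <<'EOF'
library(pbdMPI, quiet = TRUE)
init()
# all.rank = TRUE makes each rank print, instead of rank 0 only.
comm.cat("Hello world from rank", comm.rank(), "of", comm.size(), "\n",
         all.rank = TRUE)
finalize()
EOF
echo "wrote $out_dir/demo_all.r"
echo "run with: mpirun -np 2 Rscript $out_dir/demo_all.r"
```

The temporary directory is local to one machine, so this variant is only meant for the single-node test; for a multi-node run the file would have to live on a path every node can read, such as /shared.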
Next, I recommend running the program on just two nodes.
Use "mpirun -host master,slavexx -np 2 Rscript /shared/demo.r" and it should work.
If it does not, there may be something wrong with MPI. I had MPI problems at first too. Wait a long time to see whether MPI reports any problems in the output. I got a network error report from OpenMPI nearly 5 minutes after the program started.
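This step-by-step approach can be sketched as a small helper that only prints the commands, so nothing runs until you copy a line. The function name staged_cmds and the 600-second default are illustrative choices (long enough to cover an error that shows up after about 5 minutes); timeout is the GNU coreutils command, and the sketch assumes it is installed on the node where mpirun is launched.

```shell
#!/bin/sh
# Print the staged mpirun commands for two hosts, smallest scope first:
# host A alone, host B alone, then both together.
# Each command is wrapped in `timeout` so a hung run is stopped after
# LIMIT seconds instead of spinning at 100% CPU indefinitely.
staged_cmds() {
    script=$1; host_a=$2; host_b=$3; limit=${4:-600}
    for hosts in "$host_a" "$host_b" "$host_a,$host_b"; do
        echo "timeout $limit mpirun -host $hosts -np 2 Rscript $script"
    done
}

# Example with host names used in this thread (adjust to your cluster).
staged_cmds /shared/demo.r wn01 wn02
```

The first stage that hangs or fails narrows down where the problem is: a single host points at the local MPI or pbdMPI install, while the two-host stage points at communication between nodes.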

On Thursday, October 9, 2014 at 7:26:26 AM UTC+8, john alexander sanabria ordonez wrote:

john alexander sanabria ordonez

Oct 9, 2014, 1:00:38 AM
to rbigdatap...@googlegroups.com
OK, you are right: the "..., all.rank=T)" change worked, but "mpirun -host master,wn01 -np 2 Rscript /shared/demo.r" did not.

From the master node, I submitted "mpirun -host wn01 -np 2 Rscript /shared/demo.r" and it worked. The same happened when I used "wn02" instead of "wn01".

However, "mpirun -host wn01,wn02 -np 2 Rscript /shared/demo.r" did not work, although an R process drove the CPU on both nodes to about 100% utilization (so something is happening). I left the processes running for about half an hour, but there was neither progress nor any error message.
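One thing that may be worth ruling out when each node works alone but a two-node run hangs: Open MPI's TCP transport can pick a network interface that the other node cannot reach. This is only a guess about this cluster, not a diagnosis. The sketch below prints an mpirun command that restricts TCP traffic to a single interface through the btl_tcp_if_include MCA parameter. The function name mpirun_on_iface and the interface name eth0 are placeholders; the real interface name has to be looked up on the nodes.

```shell
#!/bin/sh
# Print an mpirun command that restricts Open MPI's TCP transport to one
# network interface via the btl_tcp_if_include MCA parameter.
# "eth0" below is a placeholder; use the interface that connects your nodes.
mpirun_on_iface() {
    iface=$1; hosts=$2; script=$3
    echo "mpirun --mca btl_tcp_if_include $iface -host $hosts -np 2 Rscript $script"
}

mpirun_on_iface eth0 wn01,wn02 /shared/demo.r
```

If the restricted run finishes while the unrestricted one hangs, interface selection is the likely culprit; if both hang, firewall rules between the nodes are the next thing to check.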

Wei-Chen Chen

Oct 9, 2014, 7:34:39 PM
to rbigdatap...@googlegroups.com
FYI: see the pbdMPI vignettes, Section 8.1 (FAQ), question 8.

Cristina Montañola

Jan 12, 2015, 11:57:41 AM
to rbigdatap...@googlegroups.com
Hello,

I'm also trying to run demo.r as described in this thread, and I have run into the same problem John Alexander described.
I checked the pbdMPI vignettes, Section 8.1, FAQ, question 8, as suggested, and found that there can be problems when MPICH and OpenMPI are both installed, so I removed MPICH from my cluster. Now I only have OpenMPI:
$ apt-show-versions 
...
libopenmpi-dev:amd64/trusty 1.6.5-8 uptodate
libopenmpi1.6:amd64/trusty 1.6.5-8 uptodate
openmpi-bin:amd64/trusty 1.6.5-8 uptodate
openmpi-common:all/trusty 1.6.5-8 uptodate

However, I still can't run the demo.r example. As John Alexander described, the program does not finish when I run it, but the CPU is very busy executing an R process.

To test whether the cause is a communication problem, I tried another example, 'hello.r', with the following code:
print("hello")
print(Sys.info()["nodename"])

I get a normal result when running it:
$ mpiexec --hostfile my_hostfile -np 2 Rscript hello.r
[1] "hello"
nodename
"master"
[1] "hello"
nodename
"slave"

However, I then extended hello.r to include some pbdR code as follows:
print("hello")
print(Sys.info()["nodename"])
library(pbdMPI, quiet = TRUE)
init()
x <- 100
comm.print(x, rank.print=0)
finalize()

When I run it, I get:
$ mpiexec --hostfile my_hostfile -np 2 Rscript hello.r
[1] "hello"
nodename
"master"
[1] "hello"
nodename
"slave"
and it never finishes. So the communication is working, but apparently there is an issue with pbdMPI that I am not able to solve.

Could anybody kindly help me? 

Thanks,

Cristina

Wei-Chen Chen

Jan 14, 2015, 8:15:10 PM
to rbigdatap...@googlegroups.com
Dear Cristina,

1. Make sure pbdMPI is reinstalled correctly against OpenMPI on all nodes.
2. Make sure the firewalls on all nodes allow sending to and receiving from every other node.
3. Make sure all nodes can access the same hello.r file.
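
Checks 1 and 3 could be run on each node with something like the sketch below. The function names and the default path /shared/hello.r are illustrative. The script only reports what it finds on the node where it runs, so it would need to be run on every node (for example over ssh), and it does not test the firewall. Note also that a plain hello.r without pbdMPI calls only shows that mpiexec can start Rscript on both nodes; it sends no MPI messages, so it cannot rule out a firewall problem.

```shell
#!/bin/sh
# Report where a tool resolves on this node, or that it is missing.
report_tool() {
    found=$(command -v "$1" 2>/dev/null) || found="not found"
    echo "$1: $found"
}

# Run the per-node checks for one script path.
check_node() {
    script=$1
    echo "node: $(uname -n)"
    report_tool mpirun
    report_tool Rscript
    # Check 1: is pbdMPI installed for the R that Rscript resolves to?
    if command -v Rscript >/dev/null 2>&1; then
        Rscript -e 'cat("pbdMPI:", as.character(packageVersion("pbdMPI")), "\n")' \
            2>/dev/null || echo "pbdMPI: not installed for this R"
    fi
    # Check 3: can this node read the script?
    if [ -r "$script" ]; then
        echo "script: $script is readable"
    else
        echo "script: $script is NOT readable"
    fi
}

check_node "${1:-/shared/hello.r}"
```

Comparing the output across nodes shows quickly whether mpirun and Rscript resolve to the same installs everywhere and whether the script path is really shared.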

Sincerely,
Wei-Chen Chen

Cristina Montañola

Jan 15, 2015, 7:13:51 AM
to rbigdatap...@googlegroups.com
Dear Wei-Chen,

I also saw that you have a VirtualBox image for running pbdR (http://thirteen-01.stat.iastate.edu/snoweye/pbdr/?item=vm&subitem=test_pbdr). However, to set up a cluster with it I need root privileges for the configuration. Could you share the username and password for this image?

Best regards,

Cristina 

--
Programming with Big Data in R
Simplifying Scalability
http://r-pbd.org/
---
You received this message because you are subscribed to the Google Groups "RBigDataProgramming" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rbigdataprogram...@googlegroups.com.
To post to this group, send email to rbigdatap...@googlegroups.com.
Visit this group at http://groups.google.com/group/rbigdataprogramming.
To view this discussion on the web visit https://groups.google.com/d/msgid/rbigdataprogramming/b6607f88-1af8-4875-9689-fdfef4a6e806%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Wei-Chen Chen

Jan 15, 2015, 11:56:52 PM
to rbigdatap...@googlegroups.com
Dear Cristina,

I don't remember any image, but there are many ways to change the root password without knowing it, for example:
sudo passwd root
Otherwise, ask Google.

Sincerely,
Wei-Chen Chen