We set the DRBD partition to the same size as RAM (96 GB). The initial configuration process (we also tried the cluster reformatting process) runs for a few hours while configuring the NameNode and JobTracker, with the DRBD sync itself taking a few hours.
The configuration process finishes successfully, but HDFS fails to start. The PrimaryNameNode and SecondaryNameNode checking steps fail inside Intel Manager. The pop-up error shows "Checking ... failed".
There is no error log or error message indicating which check failed.
In the Intel Manager HA area, ms_drbd_hadoop shows up as failed.
Output of “service drbd status” and “crm status” on the Primary Node and Standby node:
=====PrimaryNode=====
[root@snshadoopnn log]# service drbd status
drbd driver loaded OK; device status:
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@server-5254, 2013-01-28 14:37:17
m:res cs ro ds p mounted fstype
0:r0 Connected Secondary/Primary UpToDate/UpToDate C
[root@snshadoopnn log]# crm status
============
Last updated: Mon May 6 09:52:19 2013
Last change: Sun May 5 20:56:56 2013 via cibadmin on snshadoopnn.softnets.com
Stack: openais
Current DC: snshadoope1.softnets.com - partition with quorum
Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558
3 Nodes configured, 3 expected votes
9 Resources configured.
============
Node snshadoope1.softnets.com: standby
Online: [ snshadoopjt.softnets.com snshadoopnn.softnets.com ]
fs_hadoop (ocf::heartbeat:Filesystem): Started snshadoopjt.softnets.com
Master/Slave Set: ms_drbd_hadoop [drbd_hadoop]
Masters: [ snshadoopjt.softnets.com ]
Slaves: [ snshadoopnn.softnets.com ]
ip_hadoop (ocf::heartbeat:IPaddr2): Started snshadoopjt.softnets.com
ip_hadoop_jobtracker (ocf::heartbeat:IPaddr2): Started snshadoopjt.softnets.com
Failed actions:
mysqld_monitor_0 (node=snshadoope1.softnets.com, call=8, rc=5, status=complete): not installed
drbd_hadoop:0_monitor_0 (node=snshadoope1.softnets.com, call=3, rc=5, status=complete): not installed
=====SecondaryNode=====
[root@snshadoopjt ~]# service drbd status
drbd driver loaded OK; device status:
version: 8.3.11 (api:88/proto:86-96)
GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by root@server-5254, 2013-01-28 14:37:17
m:res cs ro ds p mounted fstype
0:r0 Connected Primary/Secondary UpToDate/UpToDate C /hadoop/drbd ext3
[root@snshadoopjt ~]# crm status
============
Last updated: Mon May 6 09:59:13 2013
Last change: Sun May 5 20:56:56 2013 via cibadmin on snshadoopnn.softnets.com
Stack: openais
Current DC: snshadoope1.softnets.com - partition with quorum
Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558
3 Nodes configured, 3 expected votes
9 Resources configured.
============
Node snshadoope1.softnets.com: standby
Online: [ snshadoopjt.softnets.com snshadoopnn.softnets.com ]
fs_hadoop (ocf::heartbeat:Filesystem): Started snshadoopjt.softnets.com
Master/Slave Set: ms_drbd_hadoop [drbd_hadoop]
Masters: [ snshadoopjt.softnets.com ]
Slaves: [ snshadoopnn.softnets.com ]
ip_hadoop (ocf::heartbeat:IPaddr2): Started snshadoopjt.softnets.com
ip_hadoop_jobtracker (ocf::heartbeat:IPaddr2): Started snshadoopjt.softnets.com
Failed actions:
mysqld_monitor_0 (node=snshadoope1.softnets.com, call=8, rc=5, status=complete): not installed
drbd_hadoop:0_monitor_0 (node=snshadoope1.softnets.com, call=3, rc=5, status=complete): not installed
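To make the two "service drbd status" listings above easier to compare, here is a minimal Python sketch that splits each resource line into its columns (cs, ro, ds, p, mounted, fstype). The names DrbdResource, parse_drbd_status and is_in_sync are made up for this illustration and are not part of DRBD or Intel Manager; the parser only assumes the DRBD 8.3 column layout shown in the pasted output.

```python
from typing import List, NamedTuple


class DrbdResource(NamedTuple):
    minor: int        # DRBD minor number, e.g. 0
    name: str         # resource name, e.g. "r0"
    cstate: str       # "cs" column: connection state
    local_role: str   # first half of "ro": role of the node the command ran on
    peer_role: str    # second half of "ro": role of the peer node
    local_disk: str   # first half of "ds": local disk state
    peer_disk: str    # second half of "ds": peer disk state
    protocol: str     # "p" column: replication protocol
    mounted: str      # mount point, empty if not mounted on this node
    fstype: str       # filesystem type, empty if not mounted on this node


def parse_drbd_status(text: str) -> List[DrbdResource]:
    """Extract resource lines such as '0:r0 Connected Primary/Secondary ...'."""
    resources = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # banner, version or prompt line
        minor, sep, name = fields[0].partition(":")
        if not sep or not minor.isdigit():
            continue  # e.g. the "m:res cs ro ds ..." header line
        if "/" not in fields[2] or "/" not in fields[3]:
            continue  # not a role/disk-state pair
        local_role, peer_role = fields[2].split("/", 1)
        local_disk, peer_disk = fields[3].split("/", 1)
        mounted = fields[5] if len(fields) > 5 else ""
        fstype = fields[6] if len(fields) > 6 else ""
        resources.append(DrbdResource(int(minor), name, fields[1],
                                      local_role, peer_role,
                                      local_disk, peer_disk,
                                      fields[4], mounted, fstype))
    return resources


def is_in_sync(res: DrbdResource) -> bool:
    """True when the peers are connected and both disks are UpToDate."""
    return (res.cstate == "Connected"
            and res.local_disk == "UpToDate"
            and res.peer_disk == "UpToDate")


if __name__ == "__main__":
    # Resource lines copied from the two listings above.
    listing = (
        "0:r0 Connected Secondary/Primary UpToDate/UpToDate C\n"
        "0:r0 Connected Primary/Secondary UpToDate/UpToDate C /hadoop/drbd ext3\n"
    )
    for res in parse_drbd_status(listing):
        print(res.name, res.local_role, res.mounted or "(not mounted)",
              "in sync" if is_in_sync(res) else "NOT in sync")
```

Read this way, both listings report the resource as Connected and UpToDate on both sides. The node under the "PrimaryNode" heading (snshadoopnn) currently holds the DRBD Secondary role, while snshadoopjt is the DRBD Primary and has /hadoop/drbd mounted, which matches the Masters/Slaves lines in "crm status". Whether that role placement is related to the failed check is not something the output alone shows.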
Thanks
-D
2.6.32-220.el6.x86_64
High Availability Operation Guide for Intel® Distribution for Apache Hadoop* software
Requirements and Recommendations for Setting Up High Availability
• DRBD replication uses TCP ports 7788 through 7799, with every resource listening
on a separate port. DRBD uses two TCP connections for every resource configured.
For proper DRBD functionality, the firewall on the Standby NameNode and Primary
NameNode cannot block TCP communication on these ports.
• Aside from DRBD, no other service or process may use TCP ports 7788 through
7799.
• Pacemaker, Corosync, and DRBD must not be installed on any node in the cluster
prior to HA configuration. The packages should only be installed from the repository
provided by the Intel® Distribution.
• The Standby NameNode and Primary NameNode must have the following kernel
version: 2.6.32-279.el6.x86_64
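The port and kernel requirements quoted above can be checked from either NameNode with a short Python sketch like the one below. It is only an illustration under stated assumptions: the function names (kernel_matches, probe_port, probe_drbd_ports) are hypothetical, the required kernel string and the port range are copied from the guide excerpt, and a TCP probe cannot tell a firewall REJECT rule apart from a port that simply has nothing listening on it.

```python
import platform
import socket
from typing import Dict, Iterable

# Values copied from the guide excerpt above.
REQUIRED_KERNEL = "2.6.32-279.el6.x86_64"
DRBD_PORTS = range(7788, 7800)  # TCP 7788 through 7799 inclusive


def kernel_matches(running: str, required: str = REQUIRED_KERNEL) -> bool:
    """Exact string comparison against the kernel release the guide names."""
    return running == required


def probe_port(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify one TCP connection attempt.

    'open'    : something accepted the connection
    'refused' : host answered, but nothing is listening (or a firewall REJECT)
    'timeout' : no answer at all, which often points to a firewall dropping packets
    'error'   : any other failure (name resolution, no route, ...)
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "error"


def probe_drbd_ports(host: str,
                     ports: Iterable[int] = DRBD_PORTS,
                     timeout: float = 2.0) -> Dict[int, str]:
    """Probe every port in the DRBD range on the peer node."""
    return {port: probe_port(host, port, timeout) for port in ports}


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r`
    print("running kernel :", running)
    print("required kernel:", REQUIRED_KERNEL)
    print("match          :", kernel_matches(running))
    # Replace the host with the peer NameNode before probing, e.g.:
    # for port, state in probe_drbd_ports("snshadoopjt.softnets.com").items():
    #     print(port, state)
```

For example, kernel_matches("2.6.32-220.el6.x86_64") returns False, so the kernel string that appears earlier in this thread would not satisfy the guide's stated requirement. For the port probe, "timeout" results are the ones worth following up in the firewall rules; "refused" is expected on ports no DRBD resource is using.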