Errors with OpenSOC Vagrant


姜政伟

Sep 12, 2015, 1:06:16 PM
to OpenSOC Support
Dear all,
      I downloaded OpenSOC Vagrant from https://github.com/OpenSOC/opensoc-vagrant and deployed it as follows:
      0. The OS is Windows 7, 64-bit.
      1. Installed VirtualBox 4.3.30.
      2. Installed Vagrant 1.7.3.
      3. Installed Fabric and its dependencies (paramiko, ecdsa, pycrypto) from the Visual Studio x64 Win64 command prompt.
      5. Downloaded the opensoc-vagrant project files.
      6. Changed to the project directory and set "node.vm.box" to "centos65" in the Vagrantfile.
      7. Downloaded jre-7u79-linux-x64.rpm and put it in the resources folder.
      8. Added "yum install -y wget" to the installDependencies function in setup-os.sh.
      9. Because HBase 0.98.13 could not be found on the Apache mirrors, I searched for and downloaded apache-hive-1.2.0-bin.tar.gz and hbase-0.98.13-hadoop2-bin.tar.gz into the resources/tmp folder, or alternatively changed "HBASE_VERSION_NUM" and "HIVE_VERSION" to the correct version numbers.
     10. Changed to the opensoc-vagrant directory and ran "vagrant up".
         So far so good.
     11. Ran fab:
        i. Ran "fab vagrant quickstart" and got the following output:

[node1] out: /************************************************************
[node1] out: SHUTDOWN_MSG: Shutting down NameNode at node1/10.0.0.101
[node1] out: ************************************************************/
[node1] out:
 
Warning: sudo() received nonzero return code 1 while executing '/opt/hadoop/bin/hdfs namenode -format vagrant -nonInteractive'!
[node1] Executing task 'supervisorctl_start'
[node1] sudo: supervisorctl start namenode
[node1] out: Traceback (most recent call last):
[node1] out:   File "/usr/bin/supervisorctl", line 5, in <module>
[node1] out:     from pkg_resources import load_entry_point
[node1] out:   File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2655, in <module>
[node1] out:     working_set.require(__requires__)
[node1] out:   File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 648, in require
[node1] out:     needed = self.resolve(parse_requirements(requirements))
[node1] out:   File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 546, in resolve
[node1] out:     raise DistributionNotFound(req)
[node1] out: pkg_resources.DistributionNotFound: meld3>=0.6.5
[node1] out:
 
Fatal error: sudo() received nonzero return code 1 while executing!
Requested: supervisorctl start namenode
Executed: sudo -S -p 'sudo password:'  /bin/bash -l -c "supervisorctl start namenode"
Aborting.
Disconnecting from 127.0.0.1:2202... done.
 
   Following the quickstart instructions, I opened https://localhost:8443 in a browser on the host, but got the page error "temporarily unable to access". Running netstat -ano showed that port 8443 on localhost was not open.
 
   ii. I also ran "fab vagrant postsetup" and got the same results:

[node1] out: Running in non-interactive mode, and data appears to exist in Storage Directory /var/lib/hadoop/hdfs/namenode. Not formatting.
[node1] out: 15/09/12 16:27:40 INFO util.ExitUtil: Exiting with status 1
[node1] out: 15/09/12 16:27:40 INFO namenode.NameNode: SHUTDOWN_MSG:
[node1] out: /************************************************************
[node1] out: SHUTDOWN_MSG: Shutting down NameNode at node1/10.0.0.101
[node1] out: ************************************************************/
[node1] out:
 
Warning: sudo() received nonzero return code 1 while executing '/opt/hadoop/bin/hdfs namenode -format vagrant -nonInteractive'!
 
[node1] Executing task 'supervisorctl_start'
[node1] sudo: supervisorctl start namenode
[node1] out: Traceback (most recent call last):
[node1] out:   File "/usr/bin/supervisorctl", line 5, in <module>
[node1] out:     from pkg_resources import load_entry_point
[node1] out:   File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2655, in <module>
[node1] out:     working_set.require(__requires__)
[node1] out:   File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 648, in require
[node1] out:     needed = self.resolve(parse_requirements(requirements))
[node1] out:   File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 546, in resolve
[node1] out:     raise DistributionNotFound(req)
[node1] out: pkg_resources.DistributionNotFound: meld3>=0.6.5
[node1] out:
 
Fatal error: sudo() received nonzero return code 1 while executing!
 
Requested: supervisorctl start namenode
Executed: sudo -S -p 'sudo password:'  /bin/bash -l -c "supervisorctl start namenode"
 
Aborting.
Disconnecting from 127.0.0.1:2202... done.

     Once I added "-w" to the command, as in "fab vagrant postsetup -w", the errors were replaced by the following output:
 
Warning: sudo() received nonzero return code 1 while executing 'supervisorctl start regionserver'!
 
[node1] Executing task 'init_ip_whitelist'
[node1] run: /opt/hbase/bin/hbase shell /vagrant/resources/opensoc/hbase_ip_whitelist.rb
[node1] out: 2015-09-12 16:44:50,493 INFO  [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
[node1] out: 2015-09-12 16:44:50,830 WARN  [main] conf.Configuration: bad conf file: element not <property>
[node1] out: 2015-09-12 16:44:53,537 WARN  [main] conf.Configuration: bad conf file: element not <property>
[node1] out: 2015-09-12 16:44:53,990 WARN  [main] conf.Configuration: bad conf file: element not <property>
[node1] out: 2015-09-12 16:44:54,464 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[node1] out: 2015-09-12 16:44:54,583 WARN  [main] conf.Configuration: bad conf file: element not <property>
[node1] out: 2015-09-12 16:45:11,894 ERROR [main] zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 4 attempts
[node1] out: 2015-09-12 16:45:11,894 WARN  [main] zookeeper.ZKUtil: hconnection-0x74754ccd0x0, quorum=node3:2181,node2:2181,node4:2181, baseZNode=/hbase-unsecure Unable to set watcher on znode (/hbase-unsecure/hbaseid)
[node1] out: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase-unsecure/hbaseid
[node1] out:    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
[node1] out:    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
[node1] out:    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
[node1] out:    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
[node1] out:    at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:481)
[node1] out:    at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
[node1] out:    at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:83)
[node1] out:    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.retrieveClusterId(HConnectionManager.java:909)
[node1] out:    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:73

    Following the port forwarding instructions, I opened a browser on the host at the following ports:
   https://localhost:8443
   All pages returned the error "temporarily unable to access". However, netstat -ano showed that, apart from port 8443, the other ports (50070, 60010, 8080, 9200) were all open. Also, when I opened http://127.0.0.1:2202 on the host, I got the message "SSH-2.0-OpenSSH_5.3 Protocol mismatch."
    I then tried changing localhost to 127.0.0.1, uninstalling VirtualBox and installing version 5.0.4 or 4.3.26, and uninstalling and reinstalling Vagrant. None of these fixed the problems above.
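
The netstat check described above can also be scripted from the host. What follows is only a minimal sketch in Python: the function name port_is_open is hypothetical, and the port list is simply the set of ports mentioned in this post, which may not match every Vagrantfile.

```python
import socket

# Ports mentioned in the post; adjust them to match the Vagrantfile in use.
FORWARDED_PORTS = [8443, 50070, 60010, 8080, 9200]


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0
    finally:
        sock.close()


if __name__ == "__main__":
    for port in FORWARDED_PORTS:
        state = "open" if port_is_open("127.0.0.1", port) else "closed"
        print("%5d  %s" % (port, state))
```

An open port only shows that something is listening on the host side of the forward; it does not show that the service inside the VM is healthy. Port 2202 is the one Fabric uses for SSH in the logs above, which is consistent with a browser receiving an SSH banner there.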

So here are my questions:
    1. Are the deployment steps above correct?
    2. Why do errors such as "...Fatal error: sudo() received nonzero return code 1 while executing!..." occur after running "fab vagrant quickstart" or "fab vagrant postsetup", and how can I avoid them?
    3. Why can these web pages not be accessed?
    4. Is there any test data in the opensoc-vagrant environment?
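
For question 1, one part that can be checked mechanically is whether the manually downloaded files from steps 7 and 9 are in the locations described there. A minimal sketch in Python, assuming the script is run from the project directory; the function name missing_files is hypothetical, and the file list is taken only from this post, not from the opensoc-vagrant scripts themselves.

```python
import os

# File locations as described in steps 7 and 9 of the post.
EXPECTED_FILES = [
    os.path.join("resources", "jre-7u79-linux-x64.rpm"),
    os.path.join("resources", "tmp", "apache-hive-1.2.0-bin.tar.gz"),
    os.path.join("resources", "tmp", "hbase-0.98.13-hadoop2-bin.tar.gz"),
]


def missing_files(project_dir, expected=EXPECTED_FILES):
    """Return the expected files that are absent under project_dir."""
    return [rel for rel in expected
            if not os.path.isfile(os.path.join(project_dir, rel))]


if __name__ == "__main__":
    for rel in missing_files("."):
        print("missing: " + rel)
```

This only confirms that the files exist; it does not verify that their versions match what the provisioning scripts expect.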

PS: Is Spark used in the current OpenSOC, or will it be in the near future, and in what scenarios?
      
Thanks a lot!

min...@gmail.com

Oct 24, 2015, 11:36:34 PM
to OpenSOC Support
This is an answer to question #2.

Run "vagrant ssh node1", then try the following:

# test this command

sudo -S -p 'sudo password:' /bin/bash -l -c "supervisorctl start namenode"


# remove & upgrade python setuptools

sudo rm -rf /usr/lib/python2.6/site-packages/setuptools* /usr/lib/python2.6/site-packages/distribute*

wget https://bitbucket.org/pypa/setuptools/raw/bootstrap/ez_setup.py -O - | sudo python


# test this command

sudo start supervisor
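
To tell whether the test commands above still hit the same problem, the captured output can be scanned for the pkg_resources error seen in the original traceback. A minimal sketch in Python; the function name missing_requirements is hypothetical, and the sample text is copied from the log in the first post.

```python
import re

# pkg_resources reports a missing distribution on a line of this form.
_NOT_FOUND = re.compile(r"pkg_resources\.DistributionNotFound:\s*(\S.*?)\s*$")


def missing_requirements(output):
    """Return the requirements reported missing in captured command output."""
    found = []
    for line in output.splitlines():
        match = _NOT_FOUND.search(line)
        if match and match.group(1) not in found:
            found.append(match.group(1))
    return found


sample = """[node1] out:     raise DistributionNotFound(req)
[node1] out: pkg_resources.DistributionNotFound: meld3>=0.6.5
[node1] out:
"""
print(missing_requirements(sample))  # prints ['meld3>=0.6.5']
```

An empty list means this particular error no longer appears in the output; it does not by itself show that supervisorctl started the service.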

mmal...@gmail.com

Feb 25, 2016, 2:22:13 AM
to OpenSOC Support, min...@gmail.com
Hi, 

I ran into the same problem described in this post while installing OpenSOC from the Vagrant repo. I have tried the solutions given below as well as the one here, downgrading meld3 from 1.0.2 to 1.0.1.

I am still facing the same error. Any help would be highly appreciated.

Thanks
Mayank