Error while running pio status


hed...@gmail.com

Jun 7, 2017, 9:32:15 AM
to actionml-user
I'm trying to set up PIO 0.11.0 in Docker using this all-in-one guide and get an error (Error initializing storage client for source HDFS) when running pio status:
root@c63e6c937cba:/# pio status
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/pio/pio-0.11.0/lib/spark/pio-data-hdfs-assembly-0.11.0-incubating.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/pio/pio-0.11.0/lib/pio-assembly-0.11.0-incubating.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[INFO] [Management$] Inspecting PredictionIO...
[INFO] [Management$] PredictionIO 0.11.0-incubating is installed at /opt/pio/pio-0.11.0
[INFO] [Management$] Inspecting Apache Spark...
[INFO] [Management$] Apache Spark is installed at /usr/local/spark
[INFO] [Management$] Apache Spark 1.6.3 detected (meets minimum requirement of 1.3.0)
[INFO] [Management$] Inspecting storage backend connections...
[INFO] [Storage$] Verifying Meta Data Backend (Source: ELASTICSEARCH)...
[INFO] [Storage$] Verifying Model Data Backend (Source: HDFS)...
[ERROR] [Storage$] Error initializing storage client for source HDFS
[ERROR] [Management$] Unable to connect to all storage backends successfully.
The following shows the error message from the storage backend.

Data source HDFS was not properly initialized. (org.apache.predictionio.data.storage.StorageClientException)

Dumping configuration of initialized storage backend sources.
Please make sure they are correct.

Source Name: ELASTICSEARCH; Type: elasticsearch; Configuration: HOME -> /usr/local/elasticsearch, HOSTS -> c63e6c937cba, PORTS -> 9300, CLUSTERNAME -> pio, TYPE -> elasticsearch
Source Name: HDFS; Type: (error); Configuration: (error)

I'm not using the aml user and do everything as root. HBase 1.2.5 is not available anymore, so I'm using HBase 1.2.6.
In all the configs I'm using c63e6c937cba in place of some-master. I've checked them multiple times and found no differences from the guide.

I noticed this in pio.log while running pio-start-all:
...
2017-06-07 12:57:17,147 INFO  org.apache.predictionio.tools.commands.Management$ [main] - Creating Event Server at 0.0.0.0:7070
2017-06-07 12:57:19,051 WARN  org.apache.hadoop.hbase.util.DynamicClassLoader [main] - Failed to identify the fs of dir hdfs://c63e6c937cba:9000/hbase/lib, ignored
java.io.IOException: No FileSystem for scheme: hdfs
...

and this while running pio status:
...
2017-06-07 13:01:18,874 INFO  org.apache.predictionio.data.storage.Storage$ [main] - Verifying Model Data Backend (Source: HDFS)...
2017-06-07 13:01:19,298 ERROR org.apache.predictionio.data.storage.Storage$ [main] - Error initializing storage client for source HDFS
java.io.IOException: No FileSystem for scheme: hdfs
        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2586)
...

So I tried to solve this error by modifying core-site.xml, as explained in this SO question, but that only got me to the next batch of errors, which I can't solve at the moment.
pio-start-all:
...
2017-06-07 13:12:46,937 INFO  org.apache.predictionio.tools.commands.Management$ [main] - Creating Event Server at 0.0.0.0:7070
2017-06-07 13:12:49,494 ERROR org.apache.predictionio.data.storage.hbase.StorageClient [main] - Failed to connect to HBase. Please check if HBase is running properly.
2017-06-07 13:12:49,494 ERROR org.apache.predictionio.data.storage.Storage$ [main] - Error initializing storage client for source HBASE
java.io.IOException: java.lang.reflect.InvocationTargetException
...
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hdfs.DistributedFileSystem not found
...
pio status:
...
2017-06-07 13:14:36,482 INFO  org.apache.predictionio.data.storage.Storage$ [main] - Verifying Meta Data Backend (Source: ELASTICSEARCH)...
2017-06-07 13:14:38,490 INFO  org.apache.predictionio.data.storage.Storage$ [main] - Verifying Model Data Backend (Source: HDFS)...
2017-06-07 13:14:38,859 ERROR org.apache.predictionio.data.storage.Storage$ [main] - Error initializing storage client for source HDFS
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hdfs.DistributedFileSystem not found
...

I googled the new error on SO and tried pio status -- --jars /path/to/hadoop-hdfs-*.jar, but it didn't help.

Configs:
core-site.xml (final after adding fs.*.impl)
<configuration>
   <property>
      <name>fs.defaultFS</name>
      <value>hdfs://c63e6c937cba:9000/</value>
   </property>

   <property>
      <name>fs.file.impl</name>
      <value>org.apache.hadoop.fs.LocalFileSystem</value>
   </property>

   <property>
      <name>fs.hdfs.impl</name>
      <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
   </property>
</configuration>

hdfs-site.xml
<configuration>
   <property>
      <name>dfs.data.dir</name>
      <value>file:///usr/local/hadoop/dfs/name/data</value>
      <final>true</final>
   </property>

   <property>
      <name>dfs.name.dir</name>
      <value>file:///usr/local/hadoop/dfs/name</value>
      <final>true</final>
   </property>

   <property>
      <name>dfs.replication</name>
      <value>2</value>
   </property>
</configuration>

hbase-site.xml
<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://c63e6c937cba:9000/hbase</value>
    </property>

    <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>hdfs://c63e6c937cba:9000/zookeeper</value>
    </property>

    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>localhost</value>
    </property>

    <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value>2181</value>
    </property>
</configuration>

pio-env.sh
#!/usr/bin/env bash

# Safe config that will work if you expand your cluster later
SPARK_HOME=/usr/local/spark
ES_CONF_DIR=/usr/local/elasticsearch/config
HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
HBASE_CONF_DIR=/usr/local/hbase/conf

# Filesystem paths where PredictionIO uses as block storage.
PIO_FS_BASEDIR=$HOME/.pio_store
PIO_FS_ENGINESDIR=$PIO_FS_BASEDIR/engines
PIO_FS_TMPDIR=$PIO_FS_BASEDIR/tmp

PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH

PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE

PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=HDFS
# PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS

# Elasticsearch Example
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=/usr/local/elasticsearch
# the next line should match the cluster.name in elasticsearch.yml
PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=pio

# For single host Elasticsearch, may add hosts and ports later
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=c63e6c937cba
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300

# dummy models are stored here so use HDFS in case you later want to
# expand the Event and PredictionServers
PIO_STORAGE_SOURCES_HDFS_TYPE=hdfs
PIO_STORAGE_SOURCES_HDFS_PATH=hdfs://c63e6c937cba:9000/models

# localfs storage, because hdfs won't work
# PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
# PIO_STORAGE_SOURCES_LOCALFS_PATH=${PIO_FS_BASEDIR}/models

# HBase Source config
PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
PIO_STORAGE_SOURCES_HBASE_HOME=/usr/local/hbase
# Hbase single master config
PIO_STORAGE_SOURCES_HBASE_HOSTS=c63e6c937cba
PIO_STORAGE_SOURCES_HBASE_PORTS=0

In pio-env.sh I tried LOCALFS for model data storage and it worked; at least there are no errors from pio status.

HDFS itself works fine, I guess; there are no errors from hdfs fsck /models etc.

Any help here?

Pat Ferrel

Jun 7, 2017, 11:29:32 AM
to hed...@gmail.com, actionml-user, us...@predictionio.incubator.apache.org
This group is for support of ActionML projects like the Universal Recommender.

Please direct PIO questions to the Apache PIO mailing list. 




Alexey Nikulov

Jun 7, 2017, 11:55:15 AM
to actionml-user
Sorry, I saw similar topics here and didn't think of another place to ask.

Anyway, I solved it by editing /storage/hdfs/build.sbt before running ./make-distribution.sh, based on information found here and the corresponding PR.

Pat Ferrel

Jun 7, 2017, 12:35:08 PM
to Alexey Nikulov, actionml-user, us...@predictionio.incubator.apache.org
Sorry for the confusion over support. PIO has many components, and the Docker container you are using is of unknown origin (to me, anyway); it seems to have misconfigured something. Please be sure to tell the author or create a PR for it so it can be fixed for other users; it's one way to pay for free software.



Dan Guja

Jun 7, 2017, 1:30:14 PM
to us...@predictionio.incubator.apache.org, Alexey Nikulov, actionml-user
Alexey,

Can you try to isolate the issue and first check whether the HBase integration is working?

Try setting up PIO to use LOCALFS for model storage: PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS
Then check whether PIO can connect to the external HBase.

I have a similar setup, and my client HBase config looks like this:


<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://hbase:9000/hbase</value>
    </property>

    <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>hdfs://hbase:9000/zookeeper</value>
    </property>

    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>hbase</value>
    </property>

    <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value>2181</value>
    </property>

    <property>
        <name>hbase.master</name>
        <value>hbase:60000</value>
        <description>The host and port that the HBase master runs at.</description>
    </property>
</configuration>

Alexey Nikulov

Jun 8, 2017, 7:18:44 AM
to actionml-user
I kind of solved the issue, as I mentioned above; at least there are no errors and the integration test works. I haven't checked any further yet.

The problem was a missing line in build.sbt in the downloaded PIO source code. Details at the link and below:

GitHub user shimamoto opened a pull request:

https://github.com/apache/incubator-predictionio/pull/389

PIO-91 Fixed hadoop-hdfs artifact missing error

I made a mistake when I reviewed dependencies.
Basically, the problem is due to unavailability of the hadoop-hdfs jars.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shimamoto/incubator-predictionio pio-91_hadoop-hdfs-missing

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-predictionio/pull/389.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #389


commit f60af2c4dd5d4656ccab189b5590a9dc451ffb27
Author: shimamoto
Date: 2017-06-05T11:55:45Z

PIO-91 Fixed hadoop-hdfs artifact missing error.
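
For reference, the gist of the change is to make sure the hadoop-hdfs artifact is on the dependency list of the hdfs storage module. Something along these lines in /storage/hdfs/build.sbt (a sketch only; the version value and surrounding settings here are my assumptions, see the PR itself for the exact diff):

// sketch of the missing dependency in storage/hdfs/build.sbt, not the exact PR change
val hadoopVersion = "2.7.3"  // hypothetical value; match the Hadoop version of your cluster

libraryDependencies ++= Seq(
  "org.apache.hadoop" % "hadoop-common" % hadoopVersion,
  // hadoop-hdfs is the jar that ships org.apache.hadoop.hdfs.DistributedFileSystem;
  // without it pio fails with "No FileSystem for scheme: hdfs" / ClassNotFoundException
  "org.apache.hadoop" % "hadoop-hdfs" % hadoopVersion
)

After rebuilding with ./make-distribution.sh the jar ends up in the distribution and, in my case, pio status passes.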
