HBase indexer with Solr 5.3 error


Sofia Panagiotidi

Mar 1, 2016, 9:45:34 AM
to HBase Indexer Users
Hello

I wrote a few months ago but I got no reply :(

I was wondering whether I can make HBase Indexer work with my Solr version 5.3.1. I am running into problems with the ZooKeeper node structure, which seems to have changed as of Solr 5, and I am not sure how to overcome this.

I start my one node Solr with

solr start -c -Dsolr.directoryFactory=HdfsDirectoryFactory -Dsolr.lock.type=none -Dsolr.hdfs.home=hdfs://master:8020/user/ubuntu/solr -s node1/solr -z master:2181

and I create the HBase index as follows

hbase-indexer add-indexer --name myIndexer2 --indexer-conf ~/Desktop/indexdemo-indexer2.xml --cp solr.zk=master:2181 --cp solr.collection=sofiacollection51 --zookeeper master:2181

The indexer gets created all right, but when I try to add something to HBase, the error at the indexer server is

16/03/01 16:30:27 INFO zookeeper.ClientCnxn: Session establishment complete on server master-VirtualBox/192.168.1.44:2181, sessionid = 0x153322b8b9100a3, negotiated timeout = 30000
16/03/01 16:30:27 INFO cloud.ConnectionManager: Watcher org.apache.solr.common.cloud.ConnectionManager@71cb27d6 name:ZooKeeperConnection Watcher:master:2181 got event WatchedEvent state:SyncConnected type:None path:null path:null type:None
16/03/01 16:30:27 INFO cloud.ConnectionManager: Client is connected to ZooKeeper
16/03/01 16:30:27 INFO cloud.ZkStateReader: Updating cluster state from ZooKeeper... 
16/03/01 16:30:27 ERROR indexer.DirectSolrInputDocumentWriter: Error updating Solr
org.apache.solr.common.SolrException: Collection not found: sofiacollection51
at org.apache.solr.client.solrj.impl.CloudSolrServer.getCollectionList(CloudSolrServer.java:338)
at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:219)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)
at com.ngdata.hbaseindexer.indexer.DirectSolrInputDocumentWriter.retryAddsIndividually(DirectSolrInputDocumentWriter.java:123)
at com.ngdata.hbaseindexer.indexer.DirectSolrInputDocumentWriter.add(DirectSolrInputDocumentWriter.java:108)
at com.ngdata.hbaseindexer.indexer.Indexer.indexRowData(Indexer.java:156)
at com.ngdata.hbaseindexer.indexer.IndexingEventListener.processEvents(IndexingEventListener.java:99)
at com.ngdata.sep.impl.SepEventExecutor$1.run(SepEventExecutor.java:97)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
^C16/03/01 16:30:27 INFO mortbay.log: Stopped SelectChann...@0.0.0.0:11060
16/03/01 16:30:28 INFO supervisor.IndexerSupervisor: IndexerWorker.EventWorker interrupted.
16/03/01 16:30:28 INFO zookeeper.ZooKeeper: Session: 0x153322b8b9100a1 closed
16/03/01 16:30:28 INFO zookeeper.ClientCnxn: EventThread shut down
16/03/01 16:30:28 INFO ipc.RpcServer: Stopping server on 41136
16/03/01 16:30:28 INFO ipc.RpcServer: RpcServer.listener,port=41136: stopping
16/03/01 16:30:28 INFO ipc.RpcServer: RpcServer.responder: stopped
16/03/01 16:30:28 INFO ipc.RpcServer: RpcServer.responder: stopping

When I check on the ZooKeeper side, though, I can see the collection "sofiacollection51":

[zk: master(CONNECTED) 5] ls /collections
[aliases.json, clusterstate.json, ngdata, sofiacollection51]

Any help appreciated
Sofia

Gabriel Reid

Mar 1, 2016, 10:04:09 AM
to Sofia Panagiotidi, HBase Indexer Users
Hi Sofia,

Which build of hbase-indexer are you using? Did you build it yourself
(i.e. checked the code out of GitHub), or did you download binaries
somewhere?

If you've built it yourself, could you specify which version you've
built, and which maven profile (if any) you used when building it?

If you downloaded binaries for hbase-indexer, could you specify which
version you're using?

Thanks,

Gabriel

Sofia Panagiotidi

Mar 1, 2016, 10:11:47 AM
to HBase Indexer Users, sof...@gmail.com
Hi Gabriel

If I remember correctly I downloaded and built it with

mvn clean install -DskipTests -Dhbase.api=0.98
My HBase version is 1.1.2, but I don't think that is a problem.
Cheers

Gabriel Reid

Mar 1, 2016, 10:15:13 AM
to Sofia Panagiotidi, HBase Indexer Users
The current master of hbase-indexer in GitHub is built against Solr 4.4.0 [1]

The first thing I would suggest trying is just setting the Solr
version in the pom file to your Solr version, and rebuilding
hbase-indexer. In the best case, this will just work and it should
resolve the issues.

It is not entirely unlikely that you will encounter compilation errors
due to changes in the Solr API though, in which case you'd need to
make some small modifications to hbase-indexer in order to allow
building it against Solr 5.3.

- Gabriel




1. https://github.com/NGDATA/hbase-indexer/blob/master/pom.xml#L22

Sofia Panagiotidi

Mar 1, 2016, 12:30:12 PM
to HBase Indexer Users, sof...@gmail.com
I just tried and after fixing some dependencies I got

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:compile (default-compile) on project hbase-indexer-engine: Compilation failure: Compilation failure:
[ERROR] /home/sofia/hbase-indexer/hbase-indexer-engine/src/main/java/com/ngdata/hbaseindexer/indexer/SolrServerFactory.java:[43,15] error: incompatible types
[ERROR]
[ERROR] could not parse error message:   required: SolrServer
[ERROR] found:    CloudSolrServer
[ERROR] /home/sofia/hbase-indexer/hbase-indexer-engine/src/main/java/com/ngdata/hbaseindexer/indexer/SolrServerFactory.java:49: error: no suitable method found for add(HttpSolrServer)
[ERROR] result.add(new HttpSolrServer(shard, httpClient));
[ERROR] ^


I am not sure I would be able to get into the code and do the fixing, but I might give it a try later on.
Cheers
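
Note: the two compiler errors above are consistent with the class renames in SolrJ 5.0, where SolrClient, CloudSolrClient and HttpSolrClient were introduced and the old *SolrServer names were kept only as deprecated classes. Before rebuilding, it can help to confirm which SolrJ generation actually ends up on a given classpath. The following is only a minimal sketch under that assumption: the class SolrjProbe and its methods are hypothetical (they are not part of hbase-indexer or SolrJ), and the check relies on nothing more than the presence of two SolrJ class names.

```java
/** Hypothetical helper, not part of hbase-indexer or SolrJ. */
final class SolrjProbe {

    /** Returns true if the named class can be found by the given class loader. */
    static boolean isPresent(String className, ClassLoader loader) {
        try {
            // initialize=false: only look the class up, do not run static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Gives a rough description of the SolrJ generation visible to the loader. */
    static String detect(ClassLoader loader) {
        // SolrClient first appeared in SolrJ 5.0; SolrServer is the older 4.x name.
        boolean hasClient = isPresent("org.apache.solr.client.solrj.SolrClient", loader);
        boolean hasServer = isPresent("org.apache.solr.client.solrj.SolrServer", loader);
        if (hasClient) {
            return "SolrJ 5.x or later (SolrClient present)";
        }
        if (hasServer) {
            return "SolrJ 4.x or earlier (only SolrServer present)";
        }
        return "SolrJ not found on this classpath";
    }
}
```

Calling SolrjProbe.detect(Thread.currentThread().getContextClassLoader()) from code running with the indexer's classpath would show which of the two cases applies; it does not report the exact SolrJ version.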

Ravi K

Mar 16, 2016, 12:40:28 PM
to HBase Indexer Users

Gabriel,


Will hbase-indexer be updated anytime soon to account for recent Solr API changes?



got event WatchedEvent state:SyncConnected type:None path:null path:null type:None
2016-03-16 10:31:59,030 INFO org.apache.solr.common.cloud.ConnectionManager: Client is connected to ZooKeeper
2016-03-16 10:31:59,030 INFO org.apache.solr.common.cloud.SolrZkClient: Using default ZkACLProvider
2016-03-16 10:31:59,032 INFO org.apache.solr.common.cloud.ZkStateReader: Updating cluster state from ZooKeeper...
2016-03-16 10:31:59,040 ERROR com.ngdata.hbaseindexer.indexer.DirectSolrInputDocumentWriter: Error updating Solr
org.apache.solr.common.SolrException: Could not find collection : hpfcollection
        at org.apache.solr.common.cloud.ClusterState.getCollection(ClusterState.java:162)
        at org.apache.solr.client.solrj.impl.CloudSolrServer.directUpdate(CloudSolrServer.java:305)
        at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:539)
        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)
        at com.ngdata.hbaseindexer.indexer.DirectSolrInputDocumentWriter.retryAddsIndividually(DirectSolrInputDocumentWriter.java:123)
        at com.ngdata.hbaseindexer.indexer.DirectSolrInputDocumentWriter.add(DirectSolrInputDocumentWriter.java:108)
        at com.ngdata.hbaseindexer.indexer.Indexer.indexRowData(Indexer.java:156)
        at com.ngdata.hbaseindexer.indexer.IndexingEventListener.processEvents(IndexingEventListener.java:99)
        at com.ngdata.sep.impl.SepEventExecutor$1.run(SepEventExecutor.java:97)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)




Gabriel Reid

Mar 17, 2016, 6:23:11 AM
to Ravi K, HBase Indexer Users
Hi Ravi,

There are currently no short-term plans (at least not on my part) to upgrade
to a more recent version of Solr, as the CDH version of Solr is
currently 4.10.

However, patches to add this functionality (while remaining backwards
compatible with the CDH release of Solr) are of course welcome.

- Gabriel

Ravi K

Apr 6, 2016, 12:22:18 PM
to Gabriel Reid, HBase Indexer Users
Hi Gabriel,

I have tried pretty much everything, but the HBase Indexer won't work.
Could you please give me some direction on where I need to focus to get this working?
I have also attached my configuration (I had to scrub hostnames and IPs). Let me know if I am doing anything wrong or if you need any additional information.

Environment - hbase-solr indexing NOT WORKING

CDH 5.4.0
Solr 5.3.0

Environment - hbase-solr indexing WORKING

CDH 5.5.1
Solr 5.2.1

Thanks for your help.

Ravi
hbase-indexer_error.docx

Gabriel Reid

Apr 7, 2016, 3:48:49 AM
to Ravi K, hbase-ind...@googlegroups.com
Hi Ravi,

Two questions:

First: does the collection "hpfcollection" actually exist in your Solr
cluster? The error message seems to imply that the collection isn't
present in Solr.

Second: which build of hbase-indexer are you using in these two
setups? i.e. are they custom builds that you've made from source, or
builds provided as part of CDH? If you've made custom builds, have you
updated the version of solr in the pom file to match what is running
on the environment?

- Gabriel

Ravi K

Apr 7, 2016, 11:22:22 AM
to Gabriel Reid, hbase-ind...@googlegroups.com
hi Gabriel,

Thank you very much for responding.

Please see details below.

1. I do see my collection in solr admin & zookeeper (screen shots below).

Inline image 2

Inline image 1

2. I am using build provided as part of CDH5.4.0


Regards,

Ravi

Ravi K

Apr 7, 2016, 3:19:21 PM
to Gabriel Reid, hbase-ind...@googlegroups.com
Hi Gabriel,


1. Should I be seeing anything related to hbase-indexer in the Solr cloud dump? If I do not see anything, is that a telltale sign of problems?
2. What is the right format for solr.zk when adding an indexer?

This one worked in my development cluster but doesn't seem to work in my client cluster (note that solr.zk has no port and no "/solr" at the end):

hbase-indexer add-indexer \
--name hpfIndexer \
--indexer-conf /tmp/morphline-hbase-mapper.xml \
--connection-param solr.zk=node1,node2,node3 \
--connection-param solr.collection=hpfcollection


Regards,

Ravi

Gabriel Reid

Apr 9, 2016, 2:12:36 AM
to Ravi K, hbase-ind...@googlegroups.com
Hi Ravi,

Just to clarify what I said before, please ensure that you're using the Solrj version that matches the Solr server that you're using.

I'm not aware of any changes that may have caused a difference in behavior between those two versions of Solr, but it could be. 

I'm also not totally clear on what the change was that you've made to the hbase-indexer code here. However, again to clarify, you should be including the "/solr" suffix on the zookeeper connection string.

- Gabriel

On Fri, Apr 8, 2016 at 8:12 PM, Ravi K <rcka...@gmail.com> wrote:
It almost seems like an issue with the SolrJ library, any thoughts on that?
I doubt that SolrJ 4.4 deals with SolrCloud in the same way as SolrJ 5.x does.

I tried this method from the indexer code locally with a few modifications and it works fine with Solr 5.2.0.
I was able to successfully add documents to Solr server.

import java.io.IOException;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class SolrCloudAddTest {

    // Adapted from the indexer code: connect to SolrCloud through ZooKeeper and add one document.
    private static void createSolrServers() throws SolrServerException, IOException {
        String solrMode = "cloud";
        if (solrMode.equals("cloud")) {
            // ZooKeeper ensemble, including ports and the "/solr" chroot
            String indexZkHost = "node1:2181,node2:2181,node3:2181/solr";
            String collectionName = "hpfcollection";
            CloudSolrServer solrServer = new CloudSolrServer(indexZkHost);

            int zkSessionTimeout = 3000; // milliseconds
            solrServer.setZkClientTimeout(zkSessionTimeout);
            solrServer.setZkConnectTimeout(zkSessionTimeout);
            solrServer.setDefaultCollection(collectionName);

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "12");
            doc.addField("content", "this is added programmatically 1233");

            // Second argument is commitWithin, in milliseconds
            solrServer.add(doc, 10);
            solrServer.shutdown();
        } else if (solrMode.equals("classic")) {
            // Classic (plain HTTP) mode from the original indexer code, disabled for this test:
            /*
            PoolingClientConnectionManager connectionManager = new PoolingClientConnectionManager();
            connectionManager.setDefaultMaxPerRoute(getSolrMaxConnectionsPerRoute(indexConnectionParams));
            connectionManager.setMaxTotal(getSolrMaxConnectionsTotal(indexConnectionParams));
            HttpClient httpClient = new DefaultHttpClient(connectionManager);
            return new HashSet<SolrServer>(createHttpSolrServers(indexConnectionParams, httpClient));
            */
        } else {
            throw new RuntimeException("Only 'cloud' and 'classic' are valid values for solr.mode, but got " + solrMode);
        }
    }

    public static void main(String[] args) throws Exception {
        createSolrServers();
    }
}

On Fri, Apr 8, 2016 at 8:58 AM, Gabriel Reid <gabrie...@gmail.com> wrote:
Hi Ravi,

No, there wouldn't be anything related to hbase-indexer in the solr cloud dump.

However, the fact that the chroot (i.e. the "/solr" suffix) is missing from your ZooKeeper connection parameter is a likely cause of the problem here. The ZK connection string should indeed include the /solr suffix (and optionally the port number), as shown here: https://github.com/NGDATA/hbase-indexer/wiki/CLI-tools#add-indexer

- Gabriel
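
Note: the connection string format discussed above (a comma-separated host list, optional ports, and a chroot such as "/solr" at the very end) can be checked before it is passed as solr.zk. The following is a minimal sketch using only the Java standard library. The class ZkConnectString is hypothetical and not part of hbase-indexer, SolrJ or ZooKeeper, and filling in 2181 for a missing port is an assumption based on ZooKeeper's default client port.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for inspecting a solr.zk value; not part of hbase-indexer. */
final class ZkConnectString {
    final List<String> hosts; // host:port entries, with the default port filled in
    final String chroot;      // e.g. "/solr", or "" when no chroot is given

    private ZkConnectString(List<String> hosts, String chroot) {
        this.hosts = hosts;
        this.chroot = chroot;
    }

    /** Splits a value such as "h1:2181,h2:2181/solr" into its host list and chroot. */
    static ZkConnectString parse(String value) {
        int slash = value.indexOf('/');
        String hostPart = slash < 0 ? value : value.substring(0, slash);
        String chroot = slash < 0 ? "" : value.substring(slash);
        List<String> hosts = new ArrayList<>();
        for (String host : hostPart.split(",")) {
            String trimmed = host.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            // Assumption: a host without a port uses ZooKeeper's default client port 2181
            hosts.add(trimmed.contains(":") ? trimmed : trimmed + ":2181");
        }
        return new ZkConnectString(hosts, chroot);
    }

    /** True when a non-root chroot such as "/solr" is present. */
    boolean hasChroot() {
        return chroot.length() > 1;
    }

    @Override
    public String toString() {
        return String.join(",", hosts) + chroot;
    }
}
```

With this sketch, ZkConnectString.parse("node1,node2,node3").hasChroot() is false, which flags the value used in the add-indexer command earlier in the thread, while "node1:2181,node2:2181,node3:2181/solr" parses to three hosts and the chroot "/solr".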

Gabriel Reid

Apr 14, 2016, 2:45:07 AM
to Ravi K, hbase-ind...@googlegroups.com
Inlined below

On Tue, Apr 12, 2016 at 6:28 AM, Ravi K <rcka...@gmail.com> wrote:
hi Gabriel,

Thank you for your help.

I tried all combinations of the zkHost setting, with and without ports and with and without /solr, without any luck :(

I also compiled the source code (https://github.com/NGDATA/hbase-indexer) and deployed the jars, but I ran into a different set of issues.

Could you elaborate on what you mean with "a different set of issues"?

And when you compiled the source, did you update the dependency versions (specifically of everything related to SolrJ) to exactly match the version of Solr that you're using?

As far as I can see, the error that you're getting is one of:
1. a problem with connection configuration info to Solr
2. version mismatch between SolrJ and the Solr server that you're using
3. the collection that you're trying to write to doesn't exist

It sounds like option 1 and option 3 have been eliminated, but I'm not totally sure that option 2 can be counted out yet.

- Gabriel
 

I hope I am not missing something obvious. What else do you think the issue could be?

2016-04-11 23:18:06,382 INFO org.kitesdk.morphline.api.MorphlineContext: Importing commands
2016-04-11 23:18:07,081 INFO org.kitesdk.morphline.api.MorphlineContext: Done importing commands
2016-04-11 23:18:07,084 ERROR com.ngdata.hbaseindexer.indexer.DirectSolrInputDocumentWriter: Error updating Solr

org.apache.solr.common.SolrException: Could not find collection : hpfcollection
        at org.apache.solr.common.cloud.ClusterState.getCollection(ClusterState.java:162)
        at org.apache.solr.client.solrj.impl.CloudSolrServer.directUpdate(CloudSolrServer.java:305)
        at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:539)
        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)
        at com.ngdata.hbaseindexer.indexer.DirectSolrInputDocumentWriter.retryAddsIndividually(DirectSolrInputDocumentWriter.java:123)
        at com.ngdata.hbaseindexer.indexer.DirectSolrInputDocumentWriter.add(DirectSolrInputDocumentWriter.java:108)
        at com.ngdata.hbaseindexer.indexer.Indexer.indexRowData(Indexer.java:156)
        at com.ngdata.hbaseindexer.indexer.IndexingEventListener.processEvents(IndexingEventListener.java:99)
        at com.ngdata.sep.impl.SepEventExecutor$1.run(SepEventExecutor.java:97)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)




Ravi
