Blog Archives

Setup HBase Indexer (Part 2)

1 – Why would someone use Solr to search on a wide-column database (HBase)?

The power of HBase search (scans) does not lie in filters; it is all about rowkey design. To take full advantage of HBase, you must know all your search queries when you design your database, so that you can put all the “search” intelligence in your rowkeys. But what if you don’t know all your search criteria at the beginning? What if you need to add extra search criteria later? Would you create a new “view” of the data with another rowkey strategy? What would you do if your client needs to search by “proximity” or wants “did you mean” style suggestions?

There is no answer to these questions other than “it depends”.

 

2 – Why did we not use Ambari for Solr deployment?

It is not officially integrated, it brings no added value, and it adds complexity to the ambari-agent scripts, which must be altered manually for this use case.



HBase, Zookeeper and Solr

 

If Solr and HBase are not on the same machine (distributed architecture), you will probably run into this ZooKeeper connection problem:

[WARN ][15:14:42,409][host:2181)] org.apache.zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
 at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
 at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
 at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)

java.io.IOException: Failed to connect with Zookeeper within timeout 30000, 
connection string: localhost:2181
 at com.ngdata.hbaseindexer.util.zookeeper.StateWatchingZooKeeper.<init>(StateWatchingZooKeeper.java:109)
 at com.ngdata.hbaseindexer.util.zookeeper.StateWatchingZooKeeper.<init>(StateWatchingZooKeeper.java:73)
 at com.ngdata.hbaseindexer.cli.BaseIndexCli.connectWithZooKeeper(BaseIndexCli.java:92)
 at com.ngdata.hbaseindexer.cli.BaseIndexCli.run(BaseIndexCli.java:79)
 at com.ngdata.hbaseindexer.cli.AddIndexerCli.run(AddIndexerCli.java:50)
 at com.ngdata.hbaseindexer.cli.BaseCli.run(BaseCli.java:69)
 at com.ngdata.hbaseindexer.cli.AddIndexerCli.main(AddIndexerCli.java:30)

Notice that HBase Indexer is trying to reach ZooKeeper on localhost (see the connection string in the exception).
Do not forget the --zookeeper parameter in a distributed HBase Indexer setup:
/opt/lucidworks-hdpsearch/hbase-indexer/bin/hbase-indexer add-indexer -n hbaseindexer -c /opt/lucidworks-hdpsearch/hbase-indexer/indexdemo-indexer.xml -cp solr.zk=hmaster:2181 -cp solr.collection=hbaseCollection --zookeeper hmaster:2181
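One way to make the flag harder to forget is to keep the ZooKeeper address in a single variable and reuse it for both the solr.zk connection parameter and the zookeeper option. The following is only a minimal sketch, assuming the same install path, indexer name, collection and hmaster:2181 host as in the command above. ZK_QUORUM, INDEXER_HOME and build_add_indexer_cmd are hypothetical names introduced for illustration; they are not part of HBase Indexer.

```shell
#!/bin/sh
# Hypothetical wrapper: ZK_QUORUM, INDEXER_HOME and build_add_indexer_cmd are
# illustrative names, not part of HBase Indexer. Paths, indexer name,
# collection and host are the ones used in this post.
ZK_QUORUM="${ZK_QUORUM:-hmaster:2181}"
INDEXER_HOME="/opt/lucidworks-hdpsearch/hbase-indexer"

build_add_indexer_cmd() {
  # Print the full command instead of running it, so it can be reviewed first.
  # The same ZK_QUORUM value feeds both solr.zk and --zookeeper.
  printf '%s\n' "$INDEXER_HOME/bin/hbase-indexer add-indexer -n hbaseindexer -c $INDEXER_HOME/indexdemo-indexer.xml -cp solr.zk=$ZK_QUORUM -cp solr.collection=hbaseCollection --zookeeper $ZK_QUORUM"
}

build_add_indexer_cmd
```

Once the printed command looks right, it can be executed with `build_add_indexer_cmd | sh`. Setting ZK_QUORUM in the environment before calling the script overrides the default host.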
