Salem's Euphoria

Sharing Experience

Implicit updateRequestProcessorChain call – Solr

How to concat two fields in schemaless mode with Solr in Cloud mode?

1 – Create a JavaScript file (concat_fields.js) with the following code:


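function processAdd(cmd) {
  // cmd.solrDoc is the document being added
  var doc = cmd.solrDoc;
  var val1 = doc.getField("field1").getValue();
  var val2 = doc.getField("field2").getValue();
  // "separator" is read from the <lst name="params"> block in solrconfig.xml
  var separator = params.get("separator");
  doc.setField("field3", val1 + separator + val2);
}

// The processor also calls these hooks, so they are defined as no-ops
function processDelete(cmd) {}
function processMergeIndexes(cmd) {}
function processCommit(cmd) {}
function processRollback(cmd) {}
function finish() {}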
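
Before uploading anything, the concatenation logic can be sanity-checked outside Solr. The sketch below is only one possible way to do that, under some assumptions: `makeDoc` and `makeParams` are hypothetical stand-ins for the Solr document and the `params` object (they are not Solr APIs), and the body of processAdd is repeated so the block runs on its own with Node.js.

```javascript
// Hypothetical stand-in for the Solr input document: only the methods the script uses.
function makeDoc(fields) {
  return {
    getFieldValue: function (name) { return fields[name]; },
    getField: function (name) {
      return { getValue: function () { return fields[name]; } };
    },
    setField: function (name, value) { fields[name] = value; }
  };
}

// Hypothetical stand-in for the "params" object that Solr binds into the script.
function makeParams(values) {
  return { get: function (name) { return values[name]; } };
}

var params = makeParams({ separator: " " });

// Same logic as concat_fields.js, repeated here so the sketch is self-contained.
function processAdd(cmd) {
  var doc = cmd.solrDoc;
  var val1 = doc.getField("field1").getValue();
  var val2 = doc.getField("field2").getValue();
  var separator = params.get("separator");
  doc.setField("field3", val1 + separator + val2);
}

var fields = { id: "1", field1: "val1", field2: "val2" };
processAdd({ solrDoc: makeDoc(fields) });
console.log(fields.field3); // prints "val1 val2"
```

This only exercises the string logic; it says nothing about how Solr loads or binds the script.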

2 – Grab your solrconfig.xml from ZK:


./zkcli.sh -zkhost zkserver1:2181,zkserver2,zkserver3 -cmd get /configs/collection_config/solrconfig.xml > ~/solrconfig.xml

You can use “-cmd list” to locate your solrconfig.xml inside the ZooKeeper file tree.

3 – Edit the solrconfig.xml you just got from ZK and add the following lines:

<updateRequestProcessorChain name="concatscript">
  <processor class="solr.StatelessScriptUpdateProcessorFactory">
    <str name="script">concat_fields.js</str>
    <lst name="params">
      <str name="separator"> </str>
    </lst>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

This XML chunk defines an updateRequestProcessorChain called “concatscript”. It is based on “solr.StatelessScriptUpdateProcessorFactory” and runs the “concat_fields.js” script on each update. The script can receive parameters; in this example, “separator” is a custom parameter, set here to a single space.

One last important point about this configuration: “RunUpdateProcessorFactory” must come at the end of the chain, otherwise the changes are never written to the index.

4 – In solrconfig.xml, find the “initParams” for update handlers and add this custom chain:

<initParams path="/update/**">
  <lst name="defaults">
    <str name="update.chain">concatscript</str>
    <str name="update.chain">add-unknown-fields-to-the-schema</str>
  </lst>
</initParams>

With these initParams, update requests will now run our script on each document add operation. Note that you will have to restart Solr to pick up the new config.

If you don’t want to make this behaviour the default, you can skip the initParams change (step 4). In that case you will need to specify the update chain explicitly in your update query. This can be done via an HTTP request such as:

http://solr.server.com:8983/solr/mycollection/update?commit=true&stream.contentType=text/csv&fieldnames=id,field1,field2&stream.body=1,val1,val2&update.chain=concatscript
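
Query strings like this one are easy to get wrong by hand (a stray HTML entity instead of a plain “&” is enough). As a rough sketch, the same URL can be assembled with URLSearchParams in Node.js. `buildUpdateUrl` is a hypothetical helper introduced for illustration, and the host, collection and values are simply the ones used in this post.

```javascript
// Hypothetical helper: builds the CSV update URL from this post,
// optionally naming an update chain explicitly.
function buildUpdateUrl(baseUrl, collection, chain) {
  var query = new URLSearchParams({
    commit: "true",
    "stream.contentType": "text/csv",
    fieldnames: "id,field1,field2",
    "stream.body": "1,val1,val2"
  });
  if (chain) {
    // Only added when the chain is not already the default in initParams
    query.set("update.chain", chain);
  }
  return baseUrl + "/solr/" + collection + "/update?" + query.toString();
}

var url = buildUpdateUrl("http://solr.server.com:8983", "mycollection", "concatscript");
console.log(url);
```

URLSearchParams percent-encodes the commas and the slash in the values; they decode back to the same parameters as in the hand-written URL.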

But let’s update our configuration first. There are many ways to do it, but I prefer the simplest, as always.

5 – Update the solrconfig.xml in ZooKeeper:

./zkcli.sh -zkhost zkserver1:2181,zkserver2:2181,zkserver3 -cmd putfile /configs/collection_config/solrconfig.xml ~/solrconfig.xml

Note that I’m running zkcli.sh from its own directory. You can easily find it with “find / -name zkcli.sh” or “locate zkcli.sh”. I intentionally left out the ZK port for zkserver3 in the command, just to remind you that 2181 is the default.

6 – Let’s upload the last piece of the puzzle, the script itself, to ZK:


./zkcli.sh -zkhost zkserver1:2181,zkserver2:2181,zkserver3:2181 -cmd putfile /configs/collection_config/concat_fields.js ~/concat_fields.js

7 – Restart Solr in cloud mode:


$SOLR_HOME/bin/solr stop -all

$SOLR_HOME/bin/solr start  -c -z zkserver1:2181,zkserver2:2181,zkserver3:2181

8 – Check the new configuration via the Solr dashboard (the new one is more accurate 😉)

[Screenshot: 2017-09-26_1217]

9 – Run the update query without specifying the update chain and check that field3 contains the expected value (here ‘val1 val2’):

http://solr.server.com:8983/solr/mycollection/update?commit=true&stream.contentType=text/csv&fieldnames=id,field1,field2&stream.body=1,val1,val2
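
The check on field3 can also be scripted. The following is a minimal sketch in Node.js, not a definitive test: `hasExpectedField3` and `verifyOnServer` are hypothetical helpers, the standard /select handler with a JSON response is assumed, and the sample object is hand-written to mimic the shape of a Solr response rather than captured from a real server.

```javascript
// Returns true when the first returned document carries the expected field3 value.
// In schemaless mode a guessed field may be multi-valued, so an array is accepted too.
function hasExpectedField3(solrResponse, expected) {
  var docs = (solrResponse.response && solrResponse.response.docs) || [];
  if (docs.length === 0) {
    return false;
  }
  var value = docs[0].field3;
  return Array.isArray(value) ? value.indexOf(expected) !== -1 : value === expected;
}

// Fetches document id=1 and checks field3. Needs a live Solr and a runtime with fetch;
// it is defined here but not called.
async function verifyOnServer(baseUrl, collection) {
  var query = new URLSearchParams({ q: "id:1", fl: "id,field3", wt: "json" });
  var res = await fetch(baseUrl + "/solr/" + collection + "/select?" + query.toString());
  return hasExpectedField3(await res.json(), "val1 val2");
}

// Hand-written sample shaped like a Solr select response, for illustration only.
var sample = { response: { numFound: 1, start: 0, docs: [{ id: "1", field3: ["val1 val2"] }] } };
console.log(hasExpectedField3(sample, "val1 val2")); // prints true
```

Against the setup in this post, the call would be verifyOnServer("http://solr.server.com:8983", "mycollection").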

 

By the way, I noticed a major problem in the Solr Wiki lately (September 2017): code samples are not rendered properly!

Hey Doug Cutting! Something went wrong here.

[Screenshot: 2017-09-26_1226.png]

 

Update:

[Screenshot: 2017-09-26_1240]

Author: Salem Ben Afia

Big Data & Java developer Search Engine Architect, Lucene Expert
