Multi-select faceting is a new feature in the, soon to be released, Solr 1.4. It introduces support for tagging and excluding filters which enables us to request facets on a super-set of results from Solr.
The Problem
Out-of-the-box support for faceted search is a very compelling enhancement that Solr provides on top of Lucene. I highly recommend reading through the excellent
article by Yonik on faceted search at
Lucid Imagination's website, if you are not familiar with it.
Faceting on a field provides a list of (term,document-count) pairs for a given field. However, the returned facet results are always calculated on the
current resultset. Therefore, whatever the current results are, the facets are always in sync with the results. This is both an advantage as well as a disadvantage.
Let us take the search UI for finding used vehicles on the
Vast.com website. There are facets on the seller's location and the vehicle's model. Let us assume that the Solr query to show that page looks like the following:
q=chevrolet&facet=true&facet.field=location&facet.field=model&facet.mincount=1

What happens when you select a model by clicking on, say "Impala"? The facet for vehicle model disappears. Why? The reason is that now only "Impala" is being shown and there are no other models present in the current result set. The Solr query looks like the following now:
q=chevrolet&facet=true&facet.field=location&facet.field=model&facet.mincount=1&fq=model:Impala

So what is wrong with this? Nothing really. Except that for ease of navigation, you may still want to show all other models and document-counts which were being shown in the super-set of the current results (the previous page). But, as we noted a while back, the facets are shown for the current result set, in which all the models are Impala. If we attempt to facet on models field with the filter query applied, we will get a list of all models. But, except for "Impala", all other models will have a zero document count.
Solution #1 - Make another Solr query
Make another call to Solr without the filter query to get the other values. Our example query would look like:
q=chevrolet&facet=true&facet.field==model&facet.mincount=1&rows=0
The rows=0 is specified because we don't really want the actual results, just the facets for the model field. This is a solution that can be used with any version of Solr. However, it is one additional HTTP request. Even though it is a bit inconvenient, this is usually fast enough. However, an additional call is expensive if you are using
Solr's Distributed Search which will send one or more queries to each shard.
Solution #2 - Tag and exclude filters
This is where multi-select faceting support comes in handy. With Solr 1.4, it is possible to tag the filter queries with a name. Then we can exclude one or more tagged queries when requesting for facets. All of this happens through additional metadata that is added to request parameters through a syntax called Local Params.
Let us go step-by-step and change the query in the above example and see how the request to Solr will look like.
1. The original request in the above example without tagging:
q=chevrolet&facet=true&facet.field=location&facet.field=model&facet.mincount=1&fq=model:Impala
2. The filter query tagged with 'impala':
q=chevrolet&facet=true&facet.field=location&facet.field=model&facet.mincount=1&fq={!tag=impala}model:Impala
3. The facet field with the 'impala' filter query excluded:
q=chevrolet&facet=true&facet.field=location&facet.field={!ex=impala}model&facet.mincount=1&fq={!tag=impala}model:Impala
Now, with this one query, you can get the facets for current results as well as for the super-set without the need to make another call to Solr. If you want Solr to return this particular facet field under an alternate name, you can add a 'key=alternative-name' local param. For example, the following Solr query will return the 'models' facet under the name of 'allModels':
q=chevrolet&facet=true&facet.field=location&facet.field={!ex=impala key=allModels}model&facet.mincount=1&fq={!tag=impala}model:Impala
Tagging, filtering and renaming is not just limited to facet fields. It can be used with facet queries, facet prefixes and date faceting too.
This is another cool contribution by Yonik (also see my
previous post). I'm really looking forward to the Solr 1.4 release. It is bringing a bunch of very useful features including the super-easy-to-setup Java based replication. But more on that in a later post.