While implementing a faceted search, we ran into the following issue:
The count of a facet was 14, but when we added that facet to the search, the result count was 15. WTF?!
The index is quite simple: it contains some non-indexed fields and one filter field that holds all the filters. The facet was added on that filter field.
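A request of roughly this shape adds such a facet. This is only a minimal sketch, assuming the legacy terms facet syntax; the field name `filter`, the `match_all` query, and the helper `build_facet_request` are hypothetical and not taken from the original setup.

```python
# Initial limit mentioned in this post.
FACET_LIMIT = 10


def build_facet_request(field, limit):
    """Build a search body with a terms facet on `field` (hypothetical helper)."""
    return {
        "query": {"match_all": {}},
        "facets": {
            # The facet is named after the field it is computed on.
            field: {"terms": {"field": field, "size": limit}},
        },
    }


# "filter" is an assumed field name used for illustration only.
request_body = build_facet_request("filter", FACET_LIMIT)
```

The `size` value is what FACET_LIMIT controls: it caps how many facet entries come back.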
FACET_LIMIT, the number of facet entries requested, was initially 10, and that is the issue:
Elasticsearch gets the top 10 facet values from each shard and then merges them. For example, the value "A" of that facet has a count of 14 on shard 01 and is in that shard's top 10, but on shard 02 it has a count of 1 and is not in the top 10. So the merged count of that facet is 14, but if you run a real search for it, all shards are searched and a count of 15 is returned.
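The effect can be reproduced without a cluster. The following is a simplified simulation, not Elasticsearch code: the shard contents, the limit of 2, and the function names are made up for illustration, and the counts for "A" mirror the example above.

```python
from collections import Counter


def facet_counts(shards, limit):
    """Merge per-shard top-`limit` term counts, the way a coordinating node would."""
    merged = Counter()
    for shard in shards:
        # Each shard reports only its own `limit` most frequent terms.
        for term, count in Counter(shard).most_common(limit):
            merged[term] += count
    return dict(merged.most_common(limit))


def search_count(shards, term):
    """A real search looks at every document on every shard."""
    return sum(shard.count(term) for shard in shards)


# Made-up data: "A" is frequent on shard 01 but rare on shard 02.
shard_01 = ["A"] * 14 + ["B"] * 20
shard_02 = ["A"] * 1 + ["B"] * 5 + ["C"] * 9
shards = [shard_01, shard_02]

facet_a = facet_counts(shards, limit=2)["A"]         # shard 02 never reports "A"
search_a = search_count(shards, "A")                 # counts "A" on both shards
facet_a_raised = facet_counts(shards, limit=3)["A"]  # limit now covers all values
```

With a limit of 2, `facet_a` is 14 while `search_a` is 15. Once the limit covers every distinct value on each shard, `facet_a_raised` matches the search count.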
There is also an issue about that.
How to overcome this:
- Just use one shard
=> Bad choice because of scalability, and therefore only possible for small indices
- Raise the FACET_LIMIT
=> We did that because the index is small, although it could become a performance issue. We raised it to 3000 without any performance decrease.
- Change the index format to avoid faceting on fields that have too many distinct values
=> Probably the best choice
- Live with it 😉