Elasticsearch 5.0

Itamar Syn-hershko looks at the new functionality in the latest version of Elasticsearch:

One fundamental feature of Elasticsearch is scoring – or results ranking by relevance. The part that handles it is a Lucene component called Similarity. ES 5.0 now makes Okapi BM25 the default similarity and that’s quite an important change. The default has long been tf/idf, which is both simpler to understand but easier to be fooled by rogue results. BM25 is a probabalistic approach to ranking that almost always gives better results than the more vanilla tf/idf. I’ve been recommending customers to use BM25 over tf/idf for a long time now, and we also rely on it at Forter for doing quite a lot of interesting stuff. Overall, a good move by ES and I can finally archive a year’s long advise. Britta Weber has a great talk on explaining the difference, and BM25 in particular, definitely a recommended watch.

This is one of several search-related features in the latest version.  Looks like a solid release.

Related Posts

Kafka Connect To Elasticsearch

Robin Moffatt shows how to take data from Kafka Connect and feed it into Elasticsearch: Whilst Kafka Connect is part of Apache Kafka itself, if you want to stream data from Kafka to Elasticsearch you’ll want the Confluent Open Source distribution (or at least, the Elasticsearch connector). The configuration is pretty simple. As before, see inline comments for […]

Read More

Grafana On Elasticsearch

Daniel Berman shows how to replace Kibana with Grafana: While very similar in terms of what can be done with the data itself within the two tools. The main differences between Kibana and Grafana lie in configuring how the data is displayed. Grafana has richer display features and more options for playing around with how […]

Read More

Categories

November 2016
MTWTFSS
« Oct Dec »
 123456
78910111213
14151617181920
21222324252627
282930