Elasticsearch 5.0

Itamar Syn-Hershko looks at the new functionality in the latest version of Elasticsearch:

One fundamental feature of Elasticsearch is scoring, or ranking results by relevance. The part that handles it is a Lucene component called Similarity. ES 5.0 now makes Okapi BM25 the default similarity, and that’s quite an important change. The default has long been tf/idf, which is simpler to understand but easier to fool with rogue results. BM25 is a probabilistic approach to ranking that almost always gives better results than the more vanilla tf/idf. I’ve been recommending that customers use BM25 over tf/idf for a long time now, and we also rely on it at Forter for quite a lot of interesting stuff. Overall, a good move by ES, and I can finally retire a years-long piece of advice. Britta Weber has a great talk explaining the difference, and BM25 in particular; it is definitely a recommended watch.
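To make the "easier to fool" point concrete, here is a minimal sketch comparing how the two weights react when a term is repeated many times in one document. It uses the commonly cited textbook forms, not code from the post or from Elasticsearch itself: the function names are hypothetical, the tf/idf variant is simplified (square-root term frequency times idf, with length and query normalization left out), and k1 = 1.2 and b = 0.75 are assumed as the usual BM25 defaults.

```python
import math


def bm25_term_score(freq, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """BM25 weight of one term in one document (textbook form, assumed defaults)."""
    # Rare terms get a higher inverse document frequency.
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: longer-than-average documents are penalized.
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    # The term-frequency part saturates: it can never exceed (k1 + 1).
    return idf * freq * (k1 + 1) / (freq + norm)


def tfidf_term_score(freq, n_docs, doc_freq):
    """Simplified tf/idf weight: sqrt(tf) * idf, with no normalization."""
    idf = 1 + math.log(n_docs / (doc_freq + 1))
    # The square root dampens repetition but keeps growing without bound.
    return math.sqrt(freq) * idf


if __name__ == "__main__":
    # Hypothetical corpus: 1000 documents, the term appears in 10 of them,
    # and the scored document has exactly the average length.
    for freq in (1, 10, 100, 1000):
        bm25 = bm25_term_score(freq, doc_len=100, avg_doc_len=100, n_docs=1000, doc_freq=10)
        tfidf = tfidf_term_score(freq, n_docs=1000, doc_freq=10)
        print(f"freq={freq:5d}  bm25={bm25:7.3f}  tfidf={tfidf:8.3f}")
```

Under these assumptions the BM25 column levels off as the term count grows, while the tf/idf column keeps climbing, which is one way a document stuffed with a single term can outrank more relevant results under tf/idf.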

This is one of several search-related features in the latest version. It looks like a solid release.

