Press "Enter" to skip to content

Category: Elasticsearch

Range Filtering in Elasticsearch

The Hadoop in Real World team shows off some Elasticsearch skills:

Filtering based on a range, like greater than, less than, greater than or equal to, and so on, is a pretty common requirement when you work with data. In this post we will see how to perform range-based filtering with Elasticsearch.

Knowing the specific syntax makes it easy to follow along. And it does help that PowerShell has similar comparison operators with -gt, -gte, and the like.
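As a rough illustration of that syntax (a minimal sketch using Python's requests library, not taken from the linked post; the index name, field, and bounds are assumptions), an Elasticsearch range query nests gt, gte, lt, and lte keys under the field being filtered:

```python
import requests

# Hypothetical query: documents in an "accounts" index whose age is at least 25
# and strictly less than 40. Index and field names are made up for illustration.
query = {
    "query": {
        "range": {
            "age": {
                "gte": 25,  # greater than or equal to
                "lt": 40    # less than
            }
        }
    }
}

resp = requests.post("http://localhost:9200/accounts/_search", json=query)
print(resp.json()["hits"]["total"])
```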


Conditional Expressions in Elasticsearch

The Hadoop in Real World team explains how to perform OR, AND, and NOT operations in Elasticsearch queries:

We can specify conditional expressions like OR and AND using the query expression during a search in Elasticsearch.

We have an index named account, and in the index we have details of account owners, including their name, address, age, sex, employer, etc.

Let’s search the documents with AGE=25 and STATE IN (‘ca’, ‘ny’) in the index.

As a spoiler, it’s not as easy as using OR, AND, and NOT, though there are synonyms.
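The synonyms in question are the bool query's must, should, and must_not clauses, which stand in for AND, OR, and NOT. Here is a hedged sketch of the AGE=25 and STATE IN ('ca', 'ny') search against the account index from the excerpt; the field names and the use of Python's requests library are assumptions on my part:

```python
import requests

# bool/must acts like AND, a terms clause covers the IN list, and should /
# must_not (not needed here) play the roles of OR and NOT.
query = {
    "query": {
        "bool": {
            "must": [
                {"term": {"age": 25}},              # age = 25
                {"terms": {"state": ["ca", "ny"]}}  # state IN ('ca', 'ny')
            ]
        }
    }
}

resp = requests.post("http://localhost:9200/account/_search", json=query)
for hit in resp.json()["hits"]["hits"]:
    print(hit["_source"])
```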


Elastic Beats and the ELK Stack

Shane Ducksbury explains where Elastic Beats fits in the ELK stack:

After my last blog post about Logstash, Elasticsearch, and Kibana, I wanted to investigate something else I kept coming across during my Logstash research: Elastic Beats.

Beats initially appeared to me to be a way to send data to Elasticsearch, the same as Logstash, leading me to wonder how Beats is different and where it fits in the ELK stack. In this blog, I’ll take a deeper look at Beats to understand how it works, what you might use it for, and how it compares with Logstash.

Read on to learn more about Elastic Beats and how it differs from Logstash.


Elasticsearch and SSPL

Vicky Brasseur looks at an announcement:

In a play to convert users of their open source projects into paying customers, today Elastic announced that they are changing the license of both Elasticsearch and Kibana from the open source Apache v2 license to Server Side Public License (SSPL). If your organisation uses the open source versions of either Elasticsearch or Kibana in its products or projects, it is now at risk of being forced to release its intellectual property under terms dictated by another.

Click through to understand the details. I’d imagine that if Elastic goes through with this, people would fork the last pre-SSPL version of their product sets and create a community spin-off, similar to MariaDB spinning off from MySQL.


Result Window Too Large in Elasticsearch

Samir Behara explains a common Elasticsearch error:

I have configured error logs for my Elasticsearch cluster, and I frequently see the error below in the logs:

org.elasticsearch.ElasticsearchException$1: Result window is too large, from + size must be less than or equal to: [10000] but was [15020]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.

Click through to understand what the issue is and how you can resolve it.
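For context, the two usual ways out of that error are raising the index.max_result_window setting the message names or paging through results with the scroll API. A minimal sketch of both, assuming a hypothetical index name and Python's requests library:

```python
import requests

BASE = "http://localhost:9200/my-index"  # hypothetical index

# Option 1 (use sparingly): raise the per-index result window limit.
requests.put(f"{BASE}/_settings", json={"index": {"max_result_window": 20000}})

# Option 2 (better for deep paging): walk the result set with the scroll API.
page = requests.post(
    f"{BASE}/_search?scroll=1m",
    json={"size": 1000, "query": {"match_all": {}}},
).json()
scroll_id = page["_scroll_id"]
while page["hits"]["hits"]:
    # ... process page["hits"]["hits"] here ...
    page = requests.post(
        "http://localhost:9200/_search/scroll",
        json={"scroll": "1m", "scroll_id": scroll_id},
    ).json()
    scroll_id = page["_scroll_id"]
```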


Data Compression with Elasticsearch

Hakan Altindag takes us through compression options when working with Elasticsearch:

On the 18th of June 2020, Elastic released Elasticsearch 7.8 along with its Java library, which makes handling compressed data easier; see the release notes here: Elasticsearch 7.8 release notes

Even though you have enabled Elasticsearch to send you compressed data, Elasticsearch will only compress it when the client requests it. The Java client can request it by sending additional request options within the HTTP request; see below for an example:

Read on to see how to enable this, as well as how clients can use it.
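The linked post shows the Java client's request options; as a rough plain-HTTP equivalent (a sketch assuming Python's requests library, a made-up index name, and a cluster whose http.compression setting allows compressed bodies), the negotiation happens through standard headers:

```python
import gzip
import json
import requests

doc = {"message": "hello"}  # hypothetical document
body = gzip.compress(json.dumps(doc).encode("utf-8"))

# Content-Encoding tells Elasticsearch the request body is gzipped; Accept-Encoding
# asks it to gzip the response. The cluster's http.compression setting must allow this.
resp = requests.post(
    "http://localhost:9200/my-index/_doc",
    data=body,
    headers={
        "Content-Type": "application/json",
        "Content-Encoding": "gzip",
        "Accept-Encoding": "gzip",
    },
)
print(resp.status_code)
```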


Elasticsearch Backups

Guy Shilo shows how you can back up an Elasticsearch cluster:

Elasticsearch faces the same challenge, and its built-in backup method is snapshots. Unlike classic storage snapshots, Elasticsearch snapshots can be stored remotely on external storage systems, which is supposed to enable them to deal with large amounts of data.

Snapshots can be stored on a shared file system (mounted on all cluster nodes), on all major cloud storage providers (Amazon S3, Azure, and GCS), and on HDFS.

The documentation can be found here.

Read on to see the demo. Even if your Elasticsearch data is not the home of record and you could rebuild the cluster, that doesn't mean ignoring backups is wise.
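As a minimal sketch of the flow Guy walks through (the repository name, mount path, and snapshot name below are placeholders, and a shared-filesystem path also has to be listed under path.repo on each node), registering a repository and taking a snapshot come down to two REST calls:

```python
import requests

BASE = "http://localhost:9200"

# Register a snapshot repository backed by a shared filesystem path that is
# mounted on every node. Repository name and path are placeholders.
requests.put(
    f"{BASE}/_snapshot/my_fs_backup",
    json={"type": "fs", "settings": {"location": "/mnt/es_backups"}},
)

# Take a snapshot of all indices and wait for it to complete.
requests.put(
    f"{BASE}/_snapshot/my_fs_backup/snapshot_1?wait_for_completion=true",
    json={"indices": "*", "include_global_state": True},
)
```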


Connecting Kafka to Elasticsearch

Danny Kay and Liz Bennett build an example of writing Kafka topic data to Elasticsearch:

The Elasticsearch sink connector helps you integrate Apache Kafka® and Elasticsearch with minimum effort. You can take data you’ve stored in Kafka and stream it into Elasticsearch to then be used for log analysis or full-text search. Alternatively, you can perform real-time analytics on this data or use it with other applications like Kibana.

For some background on what Elasticsearch is, you can read this blog post by Sarwar Bhuiyan. You can also learn more about Kafka Connect in this blog post by Tiffany Chang and in this presentation from Robin Moffatt.

This is a demo-heavy walkthrough, so check it out.
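For a sense of what the wiring looks like, here is a hedged sketch of registering the Elasticsearch sink through the Kafka Connect REST API; the connector name, topic, and URLs are placeholders, and the config keys follow the Confluent Elasticsearch sink connector's documented settings:

```python
import requests

# Hypothetical connector definition: stream the "orders" topic into Elasticsearch.
connector = {
    "name": "orders-to-elasticsearch",
    "config": {
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "topics": "orders",
        "connection.url": "http://localhost:9200",
        "key.ignore": "true",
        "schema.ignore": "true",
    },
}

# Kafka Connect's REST API listens on port 8083 by default.
resp = requests.post("http://localhost:8083/connectors", json=connector)
print(resp.status_code, resp.json())
```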


Amazon Elasticsearch Alerts

Jon Handler shows how to create alerts for Amazon Elasticsearch Service:

On April 8, Amazon ES launched support for event monitoring and alerting. To use this feature, you work with monitors—scheduled jobs—that have triggers, which are specific conditions that you set, telling the monitor when it should send an alert. An alert is a notification that the triggering condition occurred. When a trigger fires, the monitor takes action, sending a message to your destination.

This post uses a simulated IoT device farm to generate and send data to Amazon ES.

Click through for a demo.
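To give a sense of the moving parts, here is a hedged sketch of a monitor definition in the style of the Open Distro alerting plugin that backs this feature; the index pattern, field, threshold, and endpoint are made up, and authentication is omitted:

```python
import requests

# Hypothetical monitor: every minute, search a simulated IoT index and trip a
# trigger when any reading crosses a threshold. All names and values are placeholders.
monitor = {
    "type": "monitor",
    "name": "high-temperature-monitor",
    "enabled": True,
    "schedule": {"period": {"interval": 1, "unit": "MINUTES"}},
    "inputs": [{
        "search": {
            "indices": ["iot-sensor-*"],
            "query": {"size": 0, "query": {"range": {"temperature": {"gte": 100}}}},
        }
    }],
    "triggers": [{
        "name": "too-hot",
        "severity": "1",
        "condition": {
            "script": {"source": "ctx.results[0].hits.total.value > 0", "lang": "painless"}
        },
        "actions": [],  # actions referencing a destination (SNS, Slack, etc.) go here
    }],
}

# Placeholder domain endpoint; real requests need authentication (e.g., SigV4).
resp = requests.post(
    "https://search-my-domain.us-east-1.es.amazonaws.com/_opendistro/_alerting/monitors",
    json=monitor,
)
print(resp.json())
```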


Kafka In Front of ELK

Daniel Berman sets up a simple Elasticsearch-Logstash-Kibana (ELK) stack and throws Kafka in front of it:

To perform the steps below, I set up a single Ubuntu 16.04 machine on AWS EC2 using local storage. In real-life scenarios you will probably have all these components running on separate machines.

I started the instance in the public subnet of a VPC and then set up a security group to enable access from anywhere using SSH and TCP 5601 (for Kibana). Finally, I added a new elastic IP address and associated it with the running instance.

The example logs used for the tutorial are Apache access logs.

This is a great walkthrough on setup and basic configuration. If you don’t have something in place to manage logs, the ELK stack is fine.
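To make the "Kafka in front" idea concrete, here is a small sketch (not from the tutorial; the broker address, topic name, and log line are placeholders) of publishing Apache access log lines to a Kafka topic that a Logstash kafka input would then consume, using the kafka-python package:

```python
from kafka import KafkaProducer

# Producer pointed at the Kafka broker sitting in front of Logstash.
producer = KafkaProducer(bootstrap_servers="localhost:9092")

# A sample Apache access log line; in practice this would come from tailing a log file.
log_line = '10.0.0.1 - - [12/Apr/2019:09:00:00 +0000] "GET /index.html HTTP/1.1" 200 1043'

# Publish to the topic that the Logstash kafka input plugin is subscribed to.
producer.send("apache-access-logs", log_line.encode("utf-8"))
producer.flush()
```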
