
Category: Elasticsearch

Removing a Node from Elasticsearch

The Big Data in Real World team spams the delete button:

Shutting down a node abruptly is not the right way to decommission or remove a node from an Elasticsearch cluster. Doing so reduces the number of available copies of any replicated shards on that node, and it could disrupt clients that are currently consuming data from Elasticsearch.

The proper way to decommission or remove a node from Elasticsearch is to add the host to the exclusion list.

Click through to learn how to do this.
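
As a rough sketch of what adding a host to the exclusion list can look like, the Python below builds the body for a `PUT _cluster/settings` request using the `cluster.routing.allocation.exclude._ip` setting. The function names and the IP address are my own placeholders, not taken from the linked post, and the linked post may filter on a different attribute such as `_name` or `_host`.

```python
import json


def exclusion_settings(node_ips):
    """Build a cluster-settings body that asks Elasticsearch to move
    shards off the nodes with the given IP addresses (hypothetical helper)."""
    return {
        "persistent": {
            # Comma-separated list of node IPs to drain.
            "cluster.routing.allocation.exclude._ip": ",".join(node_ips)
        }
    }


def clear_exclusion_settings():
    """Build a body that resets the exclusion list (null resets a setting)."""
    return {"persistent": {"cluster.routing.allocation.exclude._ip": None}}


if __name__ == "__main__":
    # Body for: PUT _cluster/settings (the IP below is a made-up example)
    print(json.dumps(exclusion_settings(["10.0.0.1"]), indent=2))
```

Once the shards have relocated away from the excluded node, it can be shut down, and the exclusion can be cleared afterward.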

Comments closed

Creating an Alias in Elasticsearch

The Big Data in Real World team needs an alias:

An alias, as the name suggests, is another name for an index in Elasticsearch. It is quite useful when you want to refer to an index by another name. So instead of performing a reindex or cloning an index to rename it, you can create an alias for the index.

Click through for the script to create an alias, how you might use one, and the right way to delete one without removing the underlying index.
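
For a sense of the shape of an alias request, here is a minimal sketch that builds bodies for `POST _aliases`. The helper names and the index and alias names are placeholders I made up for illustration; the linked post may do this differently.

```python
import json


def add_alias(index, alias):
    """Body for POST _aliases that points `alias` at `index`."""
    return {"actions": [{"add": {"index": index, "alias": alias}}]}


def remove_alias(index, alias):
    """Body for POST _aliases that drops the alias only.
    The `remove` action leaves the underlying index in place."""
    return {"actions": [{"remove": {"index": index, "alias": alias}}]}


if __name__ == "__main__":
    # "account" and "account_alias" are hypothetical names.
    print(json.dumps(add_alias("account", "account_alias"), indent=2))
    print(json.dumps(remove_alias("account", "account_alias"), indent=2))
```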


Migrating from Elasticsearch to Azure Data Explorer

Bhaskar Kakaraparthy does a logging switcheroo:

This article is an extension to an existing article on migrating data from Elasticsearch to Azure Data Explorer (ADX) using a Logstash pipeline, written as a step-by-step guide. In this article, we will explore the process involved in migrating data from one source (ELK) to another (ADX) and discuss some of the best practices and tools available to make the process as smooth as possible.

Using Logstash for data migration from Elasticsearch to Azure Data Explorer (ADX) was a smooth and efficient process. With the help of the ADX output plugin and Logstash, I was able to migrate approximately 30 TB of data in a timely manner. The configuration was straightforward, and the data transfer with the ADX output plugin was quick and reliable. Overall, the experience of using the ADX output plugin with Logstash for data migration was positive, and I would definitely use it again for similar projects in the future.

Read on to see how.


Shipping Kafka Logs to Kibana via Filebeat

Shivani Sarthi uses Filebeat to perform log shipping:

To ship the Kafka logs, we will be using the Filebeat agent. A Filebeat agent is a lightweight shipper whose purpose is to forward and centralize log data.

For Filebeat to work, you need to install it as an agent on the desired servers. Filebeat then monitors the log files, collects the log events, and forwards them to Elasticsearch or Logstash for indexing.

Click through for an Ansible script to install Filebeat, integrate with Kafka, and communicate with Logstash for eventual querying via Kibana.


Selective Document Copy in Elasticsearch

The Hadoop in Real World team shows how to migrate specific documents when building a new index:

As shown in the other post, we still use a reindex by specifying the source and destination, but this time we also specify a query in the source with a term clause, so that only documents with state = ‘ny’ are selected from the source.

So only documents with state ny will be copied to the new index account_v3 with this reindex operation.

Click through for an example of how this works.
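
To make the idea concrete, here is a small sketch that builds a `POST _reindex` body with a term query in the source. The destination index `account_v3` and the `state` value come from the excerpt; the source index name `account` and the helper name are assumptions on my part.

```python
import json


def selective_reindex_body(source_index, dest_index, field, value):
    """Body for POST _reindex that copies only documents whose
    `field` matches `value` exactly (a term query on the source)."""
    return {
        "source": {
            "index": source_index,
            "query": {"term": {field: value}},
        },
        "dest": {"index": dest_index},
    }


if __name__ == "__main__":
    # "account" as the source index is an assumption; the excerpt
    # only names the destination, account_v3.
    body = selective_reindex_body("account", "account_v3", "state", "ny")
    print(json.dumps(body, indent=2))
```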


OpenSearch 1.0 Released

Andrew Hopp, et al., announce version 1.0 of OpenSearch:

OpenSearch is a community-driven, open source search and analytics suite derived from Apache 2.0 licensed Elasticsearch 7.10.2 & Kibana 7.10.2. It consists of a search engine daemon (OpenSearch), a visualization and user interface (OpenSearch Dashboards), and advanced features from Open Distro for Elasticsearch like security, alerting, anomaly detection and more.

Click through for the full rundown.


Range Filtering in Elasticsearch

The Hadoop in Real World team shows off some Elasticsearch skills:

Filtering on a range (greater than, less than, greater than or equal to, and so on) is a pretty common requirement when you work with data. In this post we will see how to perform range-based filtering with Elasticsearch.

Knowing the specific syntax makes it easy to follow along. And it does help that PowerShell has similar comparison operators with -gt, -ge, and the like.
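
As an illustration of that syntax, the sketch below builds a search body around Elasticsearch's `range` query and its `gt`, `gte`, `lt`, and `lte` bounds. The helper name, the `age` field, and the bounds are hypothetical choices of mine, not taken from the linked post.

```python
import json


def range_query(field, gt=None, gte=None, lt=None, lte=None):
    """Build a search body with a range query on `field`.
    Only the bounds that are actually supplied are included."""
    bounds = {"gt": gt, "gte": gte, "lt": lt, "lte": lte}
    supplied = {name: value for name, value in bounds.items() if value is not None}
    if not supplied:
        raise ValueError("at least one bound is required")
    return {"query": {"range": {field: supplied}}}


if __name__ == "__main__":
    # Hypothetical example: documents where 25 <= age < 30.
    print(json.dumps(range_query("age", gte=25, lt=30), indent=2))
```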


Conditional Expressions in Elasticsearch

The Hadoop in Real World team explains how to perform OR, AND, and NOT operations in Elasticsearch queries:

We can specify conditional expressions like OR and AND using the query expression during a search in Elasticsearch.

We have an index named account, and in the index we have details of account owners including their name, address, age, sex, employer, etc.

Let’s search the documents with AGE=25 and STATE IN (‘ca’, ‘ny’) in the index.

As a spoiler, it’s not as easy as writing OR, AND, and NOT, though the bool query offers equivalents in should, must, and must_not.
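
Here is one way the AGE=25 and STATE IN (‘ca’, ‘ny’) search could be expressed as a `bool` query body, sketched in Python. The field names `age` and `state` are inferred from the excerpt, the helper name is mine, and the linked post may structure its query differently.

```python
import json


def age_and_states_query(age, states):
    """Build a search body equivalent to: age = <age> AND state IN (<states>).
    Clauses under `must` are ANDed together; a `terms` query matches
    any one of the listed values, which plays the role of IN / OR."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"term": {"age": age}},
                    {"terms": {"state": list(states)}},
                ]
            }
        }
    }


if __name__ == "__main__":
    # Body for: GET account/_search
    print(json.dumps(age_and_states_query(25, ["ca", "ny"]), indent=2))
```

A NOT condition would go in a `must_not` list alongside `must` in the same `bool` object.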
