Introduction To ElasticSearch

Hasan Rahhal gives a good introduction to ElasticSearch:

On the top level is the total number of the docs using an empty search query, and max_score is the maximum score a document can take in a specific query. In our case it’s one, since no query was specified.

In __shards.total_ the value is the number of Lucene indexes that Elasticsearch created for that index. The default number is always 5 unless we specify otherwise on index creation time. More details about shards are explained here.

ElasticSearch is designed to store things like logs and monitoring metrics, and the interface is JSON.  This makes it very useful for certain tasks and infuriatingly difficult to do other things (like advanced aggregations).  Still, in a medium-sized or larger environment, this is probably a technology you either are using today or want to use.

Related Posts

Connecting To Elasticsearch With R

Jerod Johnson has a sample of connecting to Elasticsearch with R: You will need the following information to connect to Elasticsearch as a JDBC data source: Driver Class: Set this to cdata.jdbc.elasticsearch.ElasticsearchDriver. Classpath: Set this to the location of the driver JAR. By default, this is the lib subfolder of the installation folder. The DBI functions, […]

Read More

Writing To Elasticsearch With Spark Streaming

Anuj Saxena has an example of writing data from a Spark Streaming pipeline out to Elasticsearch: There’s been a lot of time we have been working on streaming data. Using Apache Spark for that can be much convenient. Spark provides two APIs for streaming data one is Spark Streaming which is a separate library provided […]

Read More


May 2016
« Apr Jun »