Investigating Azure Data Explorer

James Serra digs into how you can use Azure Data Explorer:

Azure Data Explorer (ADX) was announced as generally available on Feb 7th.  In short, ADX is a fully managed data analytics service for near real-time analysis on large volumes of data streaming (i.e. log and telemetry data) from such sources as applications, websites, or IoT devices.  ADX makes it simple to ingest this data and enables you to perform complex ad-hoc queries on the data in seconds – ADX has speeds of up to 200MB/sec per node (currently up to 3 nodes) and queries across a billion records take less than a second.  A typical use case is when you are generating terabytes of data from which you need to understand quickly what that data is telling you, as opposed to a traditional database that takes longer to get value out of the data because of the effort to collect the data and place it in the database before you can start to explore it.

It’s a tool for speculative analysis of your data, one that can inform the code you build, optimizing what you query for or helping build new models that can become part of your machine learning platform.  It can not only work on numbers but also does full-text search on semi-structured or un-structured data.  One of my favorite demo’s was watching a query over 6 trillion log records, counting the number of critical errors by doing a full-text search for the word ‘alert’ in the event text that took just 2.7 seconds.  Because of this speed, ADX can be a replacement for search and log analytics engines such as elasticsearch or Splunk.  One way I heard it described that I liked was to think of it as an optimized cache on top of a data lake.

Click through for James’s explanation and where you might want to use ADX.

Related Posts

Using the StreamSets Snowflake Destination

Dash Desai shows how you can use StreamSets to write data into SnowflakeDB: In particular, we’ll look at an example scenario that addresses Data Drift – where new information is added mid-stream and when that occurs the new table structure and new column values are created in Snowflake automatically. To illustrate, let’s take HTTP web server logs […]

Read More

Accessing Azure Event Hubs with Python

Neil Gelder shows us how you can write Python code to work with Azure Event Hubs: I’ve supplied these two python scripts in my github repo at the following link. First we need to open the install the relevant python libraries so you’ll need to issue the below pip command in whatever command tool you use, […]

Read More

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.


March 2019
« Feb