Rolling A Log Analytics System

Michael Sun and Jeff Shmain put together a log analytics sytem using several technologies:

This is an example of tiered system design. Tiered system is a system design pattern where data is categorized and stored in different data stores that match best to each category. It can both improve performance and lower the cost of a system. One of the most famous tiered system designs is computer memory hierarchy.  In the log analytics use case, analysts mostly search for logs in recent months, but often run batch jobs to get long term trends from logs in recent years. Therefore, recent logs are indexed and stored in Solr for search, while years of logs are stored in HBase for batch processing. As such, the index in Solr is small, which both improves performance and reduces cost, among other benefits.

Although only months of logs are stored in Solr, the logs before that period are stored in HBase and can be indexed on demand for further analysis.

Now that we have covered a high level architecture of a log analytics system, we will dive into more details of individual components.

This looks like a solid architecture for a logging system and can apply to other cases as well.

Related Posts

Joins With Kafka

Florian Trossbach and Matthias J Sax show the various sorts of joins offered in Kafka, both streams and tables: Apache Kafka’s Streams API provides a very sophisticated API for joins that can handle many use cases in a scalable way. However, some join semantics might be surprising to developers as streaming join semantics differ from […]

Read More

Polybase And RPC Protection

Casey Karst announces that Polybase supports Hadoop RPC protection: Supporting this configuration allows PolyBase to connect and query Hadoop clusters that have wire encryption turned on. This enables a secure connection between Hadoop and SQL Server; as well as, among the Hadoop Data Nodes. To connect to a Hadoop cluster with the hadoop.rpc.protection set to […]

Read More

Categories

April 2017
MTWTFSS
« Mar May »
 12
3456789
10111213141516
17181920212223
24252627282930