Press "Enter" to skip to content

Cassandra Monitoring and Data Modeling

Instaclustr has put up a couple interesting posts on Cassandra. First, Anup Shirolkar explains how we can monitor Cassandra installations:

Cassandra is developed in Java and is a JVM based system. Each Cassandra node runs a single Cassandra process. JVM based systems are enabled with JMX (Java Management Extensions) for monitoring and management. Cassandra exposes various metrics using MBeans which can be accessed through JMX. Cassandra monitoring tools are configured to scrape the metrics through JMX and then filter, aggregate, and render the metrics in the desired format. There are a few performance limitations in the JMX monitoring method, which are referred to later. 

The metrics management in Cassandra is performed using Dropwizard library. The metrics are collected per node in Cassandra. However, those can be aggregated by the monitoring system. 

On the development side, the Instaclustr team walks us through data modeling guidelines:

The ultimate goal of Cassandra data modeling and analysis is to develop a complete, well organized, and high performance Cassandra cluster. Following the five Cassandra data modeling best practices outlined will hopefully help you meet that goal:

1. Cassandra is not a relational database, don’t try to model it like one
2. Design your model to meet 3 fundamental goals for data distribution
3. Understand the importance of the Primary Key in the overall data structure 
4. Model around your queries but don’t forget about your data
5. Follow a six step structured approach to building your model. 

Because Cassandra uses a variant of SQL, it’s easy to forget that data is stored completely differently and that design decisions are quite different from what we see in the relational world.