A subset of related tables in a relational schema can satisfy any number of queries known and unknown at design time. Refactoring the schema into one Cassandra table to answer a specific query, though, will (re)introduce all the data redundancies the original design had sought to avoid.
In this series, I’ll do just that. Starting from a normalized SQL Server design and statement of the Cassandra query, I’ll develop four possible solutions in both logical and physical models. To get there, though, I’ll first lay the foundation.
This initial article focuses on the Cassandra primary key. There are significant differences from those in relational systems, and I’ll cover it in some depth. Each solution (Part III) will have a different key.
Cassandra (as well as Riak, while that was still a thing people cared about) has the concept of tables and SQL statements to work with them, but it’s quite different from a relational database, different enough that new design patterns are necessary. Just about the worst thing you could do would be to drop your relational database schema in Cassandra and call it a day.