Press "Enter" to skip to content

Choosing a Data Serialization Format

Rathish Kumar says more than “JSON and Parquet”:

In the world of software, we often work with different types of data like lists, tables, and more. These data structures are designed to be fast and efficient when our computer programs use them. However, sometimes we need to move this data out of our computer’s memory, like when we want to save it to a file or send it over the internet. To do this, we have to change the data into a special format made up of 0s and 1s, which is quite different from data structures. This process is what we call encoding or serialization. 

In this article, we’ll explore the world of encoding and decoding, which is the reverse process of turning that special format back into usable data. We’ll also take a look at different ways to do encoding and decoding, as well as important things to think about when choosing the right method for your software projects.

Sadly, ORC (Optimized Row Columnar) doesn’t make the cut, as Parquet ended up taking over that market.