This is a quick post on using the coalesce operator in Azure DocumentDB, a schema-free NoSQL database, to handle data whose structure varies from file to file. Varying data structure is a common issue in big data and analytics projects. A schema-free database like DocumentDB lets us ingest and store data with varying structures without much upfront effort. The cost comes later, when we want to analyze that data: at query time (think schema on read), we still need to impose a consistent structure before we can perform analytics.
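To make the idea concrete, here is a minimal Python sketch of what coalescing does at read time. It is an illustration only, not DocumentDB's implementation. The documents and the property names (`lastName`, `surname`, `name.family`) are hypothetical, as is the `coalesce` helper. The query string at the end shows roughly how the same intent might be written with DocumentDB's `??` operator, assuming a collection aliased as `c`.

```python
# Sentinel standing in for "undefined", i.e. the property is absent.
_MISSING = object()


def coalesce(doc, *paths):
    """Return the value at the first dotted path present in doc, else None."""
    for path in paths:
        value = doc
        for key in path.split("."):
            if isinstance(value, dict) and key in value:
                value = value[key]
            else:
                value = _MISSING
                break
        if value is not _MISSING:
            return value
    # Simplification: a real query would omit the property rather than emit None.
    return None


# Hypothetical documents ingested from files with different shapes.
docs = [
    {"id": "1", "lastName": "Smith"},
    {"id": "2", "surname": "Jones"},
    {"id": "3", "name": {"family": "Lee"}},
]

# Impose one consistent shape at read time (schema on read).
rows = [
    {"id": d["id"], "familyName": coalesce(d, "lastName", "surname", "name.family")}
    for d in docs
]

# Roughly the same intent as a DocumentDB query (property names are assumptions).
QUERY = "SELECT c.id, c.lastName ?? c.surname ?? c.name.family AS familyName FROM c"
```

Each document ends up with the same `familyName` field regardless of which property originally held the value, which is the consistent structure that analytics needs.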
Read the whole thing.