On-Premises Scale-Out Post-Big Data Clusters

Chris Adkin looks at alternatives to SQL Server 2019 Big Data Clusters:

This post assumes that, for data sovereignty, fiduciary, or other regulatory reasons, the:

– analytics platform will be underpinned by something that is agnostic to both cloud and on-premises infrastructure; in other words, Kubernetes.

– focal points of the Data Lake processing element will be Python and open source tools.

– SQL Server 2022 S3 object virtualisation is the preferred technology for querying the Data Lake via a T-SQL surface area.

– S3 is the preferred technology for storing the data in our Data Lake.
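
To make the last two assumptions concrete, here is a minimal Python sketch that renders the kind of T-SQL SQL Server 2022 uses to query Parquet data in S3-compatible object storage: a database scoped credential, an external data source with an `s3://` location, and an `OPENROWSET` query. It is not taken from Chris's post. The function name, endpoint, bucket, object path, and credential and data source names are all hypothetical placeholders, and it assumes PolyBase is enabled and a database master key already exists. The sketch only builds the statements as strings and does not connect to anything.

```python
def build_s3_virtualisation_sql(endpoint, bucket, object_path,
                                access_key_id, secret_key,
                                credential_name="s3_cred",
                                data_source_name="s3_lake"):
    """Render T-SQL for querying a Parquet object in S3-compatible storage.

    All names and values are illustrative; nothing is executed here.
    """
    def quote(value):
        # Escape single quotes for use inside a T-SQL string literal.
        return value.replace("'", "''")

    def bracket(name):
        # Escape closing brackets for use inside a bracketed identifier.
        return "[" + name.replace("]", "]]") + "]"

    # The secret for S3-compatible storage is "<access key id>:<secret key>".
    credential = (
        f"CREATE DATABASE SCOPED CREDENTIAL {bracket(credential_name)}\n"
        f"WITH IDENTITY = 'S3 Access Key',\n"
        f"     SECRET = '{quote(access_key_id)}:{quote(secret_key)}';"
    )

    # The data source points at the object store endpoint (host:port).
    data_source = (
        f"CREATE EXTERNAL DATA SOURCE {bracket(data_source_name)}\n"
        f"WITH (\n"
        f"    LOCATION = 's3://{quote(endpoint)}/',\n"
        f"    CREDENTIAL = {bracket(credential_name)}\n"
        f");"
    )

    # The BULK path is relative to the data source: /<bucket>/<object path>.
    path = "/" + bucket.strip("/") + "/" + object_path.lstrip("/")
    query = (
        f"SELECT TOP 10 *\n"
        f"FROM OPENROWSET(\n"
        f"    BULK '{quote(path)}',\n"
        f"    FORMAT = 'PARQUET',\n"
        f"    DATA_SOURCE = '{quote(data_source_name)}'\n"
        f") AS lake_rows;"
    )
    return [credential, data_source, query]


if __name__ == "__main__":
    # Hypothetical endpoint, bucket, and keys, purely for illustration.
    statements = build_s3_virtualisation_sql(
        endpoint="minio.example.internal:9000",
        bucket="lake",
        object_path="sales/2024.parquet",
        access_key_id="EXAMPLEKEYID",
        secret_key="example-secret",
    )
    print("\n\n".join(statements))
```

In practice the rendered statements would be run through whichever SQL Server client you already use, and the keys would come from a secret store instead of being passed around as plain strings.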

Read on for the high-level solution and stay tuned for more detailed answers.