On-Premises Scale-Out Post-Big Data Clusters

Chris Adkin looks at alternatives to SQL Server 2019 Big Data Clusters:

This post assumes that, for reasons of data sovereignty or for fiduciary or other regulatory reasons:
– the analytics platform will be underpinned by something that is agnostic to cloud and on-premises infrastructure, in other words Kubernetes
– the focal points of the Data Lake processing element will be Python and open source tools
– SQL Server 2022 S3 object virtualisation is the preferred technology for querying the Data Lake via a T-SQL surface area
– S3 is the preferred technology for storing the data in our Data Lake
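
To make the last two assumptions a little more concrete, here is a minimal Python sketch (not taken from Chris's post) that assembles the T-SQL typically involved in pointing SQL Server 2022 at an S3-compatible endpoint and querying a Parquet file. The helper name, endpoint, bucket, object path, and object names are all hypothetical, the secret is left as a placeholder, and it assumes PolyBase is already installed and enabled on the instance.

```python
def build_s3_query_statements(endpoint, bucket, object_path,
                              credential="s3_credential",
                              data_source="s3_data_lake"):
    """Return T-SQL statements (as strings) for querying Parquet data on S3.

    Hypothetical helper for illustration only: it builds text and does not
    connect to SQL Server. All names and values are assumptions.
    """
    create_credential = (
        f"CREATE DATABASE SCOPED CREDENTIAL {credential}\n"
        "WITH IDENTITY = 'S3 Access Key',\n"
        # Placeholder only; never hard-code real keys.
        "     SECRET = '<access_key_id>:<secret_key>';"
    )
    create_data_source = (
        f"CREATE EXTERNAL DATA SOURCE {data_source}\n"
        f"WITH (LOCATION = 's3://{endpoint}/',\n"
        f"      CREDENTIAL = {credential});"
    )
    query = (
        "SELECT TOP (10) *\n"
        f"FROM OPENROWSET(BULK '/{bucket}/{object_path}',\n"
        "                FORMAT = 'PARQUET',\n"
        f"                DATA_SOURCE = '{data_source}') AS lake;"
    )
    return [create_credential, create_data_source, query]


if __name__ == "__main__":
    # Example values are made up; substitute your own endpoint and bucket.
    for statement in build_s3_query_statements(
            "s3.example.internal:9000", "datalake", "sales/2023.parquet"):
        print(statement)
        print()
```

The generated statements would still need to be run against a database that has a master key, by a login with the appropriate permissions.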

Read on for the high-level solution and stay tuned for more detailed answers.
