Full and Incremental Loads in Microsoft Fabric

John Miner continues a series on data engineering in Microsoft Fabric:

In a data lake, we have a bronze quality zone that supposed to represent the raw data in a delta file format. This might include versions of the files for auditing. In the silver quality zone, we have a single version of truth. The data is de-duplicated and cleaned up. How can we achieve these goals using the Apache Spark engine in Microsoft Fabric?

Read on for John’s take on the answer. I’ve found that I have a fairly good answer for smaller datasets, though as the size of the data gets larger, the less I like answers for the raw layer.

M	T	W	T	F	S	S
			1	2	3	4
5	6	7	8	9	10	11
12	13	14	15	16	17	18
19	20	21	22	23	24	25
26	27	28	29