Virtualize Data Or Move It?

James Serra contrasts data virtualization with traditional ETL, which moves data into a warehouse:

Data virtualization integrates data from disparate sources, locations and formats, without replicating or moving the data, to create a single “virtual” data layer that delivers unified data services to support multiple applications and users.

Data movement is the process of extracting data from source systems and bringing it into the data warehouse and is commonly called ETL, which stands for extraction, transformation, and loading.

If you are building a data warehouse, should you move all the source data into the data warehouse, or should you create a virtualization layer on top of the source data and keep it where it is?

Read on for James’s thoughts.

