How many times have you heard an executive request a panel with the company’s sales data in real time? How frequently has this single request – which is more often a preference than an important business requirement – affected the overall architecture of your analytical solution?
In the Power BI world, requirements for real time often drive the creation of a pure DirectQuery model, with no aggregations to avoid data latency. This choice is incredibly expensive: the computational cost of each individual query is borne by the data source, which is often a relational database like SQL Server. On top of its cost, with this approach you will face scalability, performance, and modeling issues. Indeed, the relational database on top of which DirectQuery runs is mostly designed for transactional processing instead of being optimized for the workload of analytical processes. Optimizing the model is both difficult and expensive. Finally, using DirectQuery creates specific modeling constraints and the need for modeling workarounds to obtain good performance.
Creating an entire model using DirectQuery for the sole purpose of achieving a few real-time dashboards is definitely excessive. The primary scenario where relying on DirectQuery makes sense is when it is not feasible to import data quickly enough to satisfy the latency requirements for the majority of the reports. When the entire model can be in import mode, and a small number of dashboards require DirectQuery, there are better options available.
Definitely worth the read.