If I could teach SQL to analysts who plan to work in industry data science, I’d start by sharing a few SQL Truths I’ve learned, and why I recommend tracking SQL queries in git. Here goes:
- You will *always* need that query again
- Queries are living artifacts that change over time
- If it’s useful to you, it’s useful to others (and vice versa)
Focusing on these points has led to my continuous adoption of a query library — a git repository for saving and sharing commonly (and uncommonly) used queries, all while tracking any changes made to these queries over time.
This is separate from keeping database objects (like table or procedure definitions) in source control.
Another solution that can be used to track changes executed against your database is to read Transaction Log file or Transaction Log Backups. Transaction log file (and backups) consists of every transaction executed against your database. The only requirement for this is to have a database in the Full recovery model. In the Simple recovery model, every committed transaction can be very quickly overwritten by another one.
Also, this is something that you get for free without the need to enable and configure any additional SQL Server functionality. Of course, besides the database backups, but you already do database backups, right?
To test this approach, you have to make some preparations. You need to set the database’s recovery model to Full. As a first step, check database properties to ensure that it is appropriately configured. As a second step, you need to create a full database backup. From this point, the database is in Full recovery model, and every transaction is fully logged. Thanks to this you are able to read logged transactions from Transaction Log file. The same applies to read from Transaction Log backup. To do this, you need to create such a backup after you execute database schema changes.
I think I’ve only done this once or twice, but it’s a good technique to know about.
If you really like a certain R visual, you can also package it as a pbiviz file to share with others. Once you set up the foundation to create the first pbiviz, it is easy to crank out many more just by replacing the R code and repackaging it (into a different pbiviz file). See instruction here.
But this post isn’t about making charts. It turns out you can hijack the R visual to do lots of other things too. Below are a few examples:
Note: I am no R expert. The examples below are relatively simple and cobbled together from similar things online. They may be a little clunky, but worth it, in my opinion, to be able to dynamically leverage many more of the R capabilities through Power BI.
Read on for some interesting examples.
I would not say it’s common place to load structured data into the data lake, but I do see it frequently.
In most cases it is not necessary to first copy relational source data into the data lake and then into the data warehouse, especially when keeping in mind the effort to migrate existing ETL jobs that are already copying source data into the data warehouse, but there are some good uses cases to do just that:
There are some good reasons in here, so check them out.
In a past post here I did an overview of different index types. I said in that post that I think filtered indexes could be more popular. In this post I will cover fixing some of the problems caused when you first introduce rowstore filtered indexes to a SQL Server database.
Some of you have probably been there already. You’ve put in your first filtered index on a database only to find an issue has happened. I’ve witnessed these issues at a few places. This will hopefully reduce the pain.
I’ve definitely experienced the third issue (which also pops up when using parameterized queries, so the optimizer doesn’t know that it can use the filtered index), but never the first two.
Power Query is a user friendly ETL tool (Extract, Transform and Load). Traditionally ETL has been done using more complicated tools (such as SQL Server Integration Services – SSIS) and the resulting data is stored in a data mart or data warehouse for consumption by anyone that needs a standard view of the data. Power BI Desktop can consume tables directly from a data warehouse and simply load the table into Power BI – dead easy. But Power Query is also a powerful ETL tool in its own right, and it can be used to transform and reshape the source data directly inside Power BI Desktop (and then PowerBI.com). This is very useful if :
- You don’t have a data warehouse and/or
- You need some variation of what is in your data warehouse.
- You have other data sources that are not in a data warehouse but are still important to you.
Taking this approach (manipulate in Power Query) is perfectly fine if you have a single workbook, but what if you have 10 similar workbooks all needing the same transformation? Worse still, what if you are one of many people in a company all doing the same thing with multiple workbooks?
Read on for the solution.
Power BI comes with several built-in themes and a whole gallery full of custom themes available for download. But what if you still can’t find the perfect look for your reports? No problem! Just create your own custom Power BI themes 🙂
…sounds simple enough, right? It only takes a few minutes to create a custom Power BI theme with a color palette of your choice. Whoosh – instant custom branding!
But if you are like me, simple color changes might not be enough. Maybe you want finer control of borders, fonts, labels, or other visual elements. Or maybe you just don’t want to keep changing the same settings over and over and over again in multiple visualizations and reports. (Please don’t do that.)
You can control all of these things in custom Power BI themes. It is, however, not quite as simple as creating a color palette… yet. (You never know when the Power BI product team will blow your mind with a new update!) But for now, we need to define custom themes in JSON files.
Click through to learn how to do some of these changes through the power of editing JSON files.