Press "Enter" to skip to content

Month: April 2020

Dataflows as an Alternative to Incremental Loading in Power BI

Imke Feldmann gives us an alternative to Power BI’s incremental loading using Dataflows:

If you’ve been following my blog for a while, you might have noticed my interest in incremental load workarounds. It took some time before we saw the native functionality for it in Power BI, and it was first released for premium workspaces only. Fortunately, we now have it for shared workspaces / pro licenses as well, and it is a real life saver for scenarios where refresh speed is an issue.

However, there is a second use case for incremental refresh scenarios that is not covered ideally by the current implementation. This is where the aim is to harvest and store data in Power BI that will become unavailable in its source in the future, or where one simply wants to keep a record of changes in a data source. Chris Webb has beaten me to this article here and describes in great detail how that setup works. He also mentions that this is not a recommended setup, with which I agree. Another disadvantage of that solution is that the harvested data is only available as a shared dataset instead of a “simple” table. This limits the use cases and might force you to set up these incremental refreshes in multiple datasets.

Read on for more information.


Metadata-Only Column Changes with SQL Server 2016

Paul White takes us through several metadata-only changes which SQL Server 2016 introduced:

These changes can be metadata-only because the underlying binary data layout does not change when Column Descriptor row format is used (hence the need for compression). Without compression, row store uses the original FixedVar representation, which cannot accommodate these fixed-length data type changes without rewriting the physical layout.

You may notice that tinyint is omitted from the integer types list. This is because it is unsigned, while the other integer types are all signed, so a metadata-only change is not possible. For example, a value of 255 can fit in one byte for tinyint, but requires two bytes in any of the signed formats. The signed formats can hold -128 to +127 in one byte when compressed.

This is very interesting, but note the long list of requirements for it to work, notably that compression must be enabled on all indexes and partitions.
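As a rough sketch of the kind of change Paul describes (the table and column names here are made up, and the full list of requirements in his post still applies), widening an integer column on a row-compressed table looks like this:

CREATE TABLE dbo.MetadataDemo
(
    ID int NOT NULL
) WITH (DATA_COMPRESSION = ROW);

-- With compression in place (and the other conditions met), this can be a metadata-only change
ALTER TABLE dbo.MetadataDemo ALTER COLUMN ID bigint NOT NULL;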


Documenting SQL Server Tables

Phil Factor has a way to create table documentation in source control and propagate it to the actual database:

It has always been a problem that documentation in the source, where it should be, is not then passed into the live database when the build script is executed. In a table, you have columns, constraints and indexes that you are likely to document using line-ending comments and block comments. You probably have a big block comment at the start, explaining the table. This information should be available in the live database. Microsoft don’t have a good answer and vaguely go on about adding comments in extended properties. Well, that’s fine but it hasn’t happened, unsurprisingly: Have you ever tried to do it? It is an almost impossible task, even with SQL Doc.

My solution is to execute my finely-documented build script as usual to create the latest version of the database, and then process the same script in PowerShell to add all the comments and documentation as extended properties in the right place in the live database.

It’s an interesting approach to a classic problem.
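The extended properties themselves are ultimately added with sp_addextendedproperty; as a minimal sketch of what that looks like in plain T-SQL (the schema, table, column, and description below are placeholders):

EXEC sys.sp_addextendedproperty
    @name = N'MS_Description',
    @value = N'Customer contact details, maintained by the sales team',
    @level0type = N'SCHEMA', @level0name = N'dbo',
    @level1type = N'TABLE',  @level1name = N'Customer',
    @level2type = N'COLUMN', @level2name = N'EmailAddress';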


Diagnosing Out of Memory Failures with SQL Server 2017

Lonny Niederstadt had a curious issue:

When [Max Server Memory] can be attained by SQL Server, [Target Server Memory] will be equal to [Max Server Memory].  If memory conditions external to SQL Server make [Max Server Memory] unattainable, [Target Server Memory] will be adjusted downward to an attainable value. As workload is placed on the system, [Total Server Memory] grows toward [Target Server Memory].  That’s typical, expected behavior.

In this case, the story stays boring until after the following midnight.  There wasn’t enough workload to drive much growth of [Total Server Memory] until about 2:15 am, after which [Total Server Memory] grew fairly rapidly.  [Total Server Memory] reached a plateau between 3:00 am and 3:15 am, and then stepped slightly down later.  [Target Server Memory] was never attained.  That’s curious.

Indeed it is. Read the whole thing. And, given that it is labeled as Part 1, stay tuned for Part 2.
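If you want to watch the same counters on your own instance, they are exposed through sys.dm_os_performance_counters; a minimal query along these lines does the trick:

SELECT counter_name,
       cntr_value / 1024 AS value_mb
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Memory Manager%'
  AND counter_name IN (N'Target Server Memory (KB)', N'Total Server Memory (KB)');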


Explaining SQL_VARIANT

Kenneth Fisher explains what the SQL_VARIANT data type is used for:

The SQL_VARIANT data type is an interesting beast. It is a data type that can store most types of data. So a date, decimal, int, varchar, etc. can be stored in this single data type. This sounds great, right? Well there are a few issues here.

I don’t think I’ve ever used SQL_VARIANT data types before. It always struck me as a refuge for not wanting to think about what the proper data type should be. In fairness to it, though, I’ve seen plenty of unthoughtful solutions using NVARCHAR as well.
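For a quick feel of how it behaves, here is a small sketch: a single sql_variant variable holding an int and then a date, with SQL_VARIANT_PROPERTY reporting the underlying type each time.

DECLARE @v sql_variant = CAST(42 AS int);

SELECT @v AS stored_value,
       SQL_VARIANT_PROPERTY(@v, 'BaseType')  AS base_type,
       SQL_VARIANT_PROPERTY(@v, 'MaxLength') AS max_length;

-- The same variable can later hold a completely different type
SET @v = CAST('2020-04-01' AS date);

SELECT SQL_VARIANT_PROPERTY(@v, 'BaseType') AS base_type;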


Querying Database and Log File Sizes with T-SQL

Allen White takes us through an easy technique to check database and log file sizes:

As a consultant, I have to be able to quickly spot problems, and one of the problems I frequently find is transaction log files that are incorrectly sized.

There are two catalog views in the master database which make this easy to do – sys.master_files and sys.databases. The sys.master_files view contains the database and individual file names, and the data_space_id column always has a value of 0 for the log file. The size column returns the value in 8KB pages, so we have to multiply the column by 8, then divide by 1024 to get the size in megabytes (MB).

Click through for the demo.
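Allen’s post has the full version; a stripped-down query built on the same two views and the 8KB-page math looks roughly like this:

SELECT d.name AS database_name,
       mf.name AS logical_file_name,
       CASE mf.data_space_id WHEN 0 THEN 'Log' ELSE 'Data' END AS file_type,
       mf.size * 8 / 1024 AS size_mb
FROM sys.master_files AS mf
    JOIN sys.databases AS d
        ON d.database_id = mf.database_id
ORDER BY d.name, mf.data_space_id;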


Inserting Geospatial Data into Cosmos DB

Hasan Savran shows us how we can use the .NET SDK to insert geospatial data into Cosmos DB:

Geospatial data can help you answer many questions in your business if you know how to use spatial data. Searching data by radius can bring you all kinds of interesting data. For example, if you know the path of a hurricane, you can make searches by using the path and find all your customers under that path. Then you can be proactive and do something about this upcoming problem for your business.

Many databases support spatial data types; I will cover how to store spatial data in Azure Cosmos DB in this post. I have an earlier post about how to import spatial data into Azure Cosmos DB by using the Data Migration Tool. Here I will focus on how to store spatial data by using the .NET SDK. I used .NET SDK 3.8.0; you can get the latest SDK from here.

Click through for a demonstration.
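To get a feel for the radius-search idea Hasan mentions, here is a small sketch on the query side (Cosmos DB’s SQL API rather than the .NET SDK inserts from his post); the location property and coordinates are made up, and ST_DISTANCE works in meters, so this finds documents within 50 km of a point:

SELECT c.id
FROM c
WHERE ST_DISTANCE(c.location, {
          "type": "Point",
          "coordinates": [-81.38, 28.54]
      }) < 50000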


The Importance of Gridlines

Stephanie Evergreen shows why (subtle) gridlines are so important in visuals:

Here’s the thing: This chart NEEDS gridlines. I’ve said this before but I find this anti-gridline trend so common that I need to address this topic explicitly.

The medium gray (not black) gridlines are necessary because I do not have data labels on every one of the dots in the chart.

A quick reminder is that even Edward Tufte (a key proponent of the “gridlines are bad” school) doesn’t hate all gridlines. Subtlety is key with them: they should be there when you need them but easily ignored when you don’t.


Working with Jupyter Books in Azure Data Studio

Jamie Wick takes us through using Jupyter Book in Azure Data Studio:

The first thing to know is that Jupyter Books and “Jupyter Book support” (in Azure Data Studio) are slightly different concepts. Jupyter Books let you build web-based collections of Jupyter notebooks. Jupyter Book support allows you to build collections of Jupyter notebooks on your local computer or network (i.e., not web-based). Additionally, all of the standards and functionality of online Jupyter Books may not be fully supported or implemented in Azure Data Studio.

Click through for what this means as well as what the March 2020 release brought us.


Bulk Migration of Data and Log Files in SQL Server

David Fowler shows how you can change file paths for all of your databases in one fell swoop:

You’ve got a SQL Server with a few hundred databases on it (to be honest it doesn’t even need to be quite that many) and you need to move all the data and log files to a new location. Perhaps you’re going to be migrating onto a new, shiny SAN or maybe your disks are just about full and you need to shift a bunch of the files off somewhere else.

The first thing that you’re going to need to do is change the paths of the files in SQL. That’s easy enough to do with an ALTER DATABASE statement.

ALTER DATABASE SQLUndercover MODIFY FILE (NAME = SQLUndercover_Log, FILENAME = 'F:\SQLLogs\SQLUndercover_Log.ldf');

But that’s going to get very tedious very quickly if you’ve got to do that for a whole lotta databases. So to help out, I thought I’d share a little script that I’ve been using for a while (or a variation on it at least) to make the process far easier and generate all the ALTER statements for you.

Click through for the script as well as a bit of advice around the actual moving of the files.
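David’s script is in the post itself; as a loose sketch of the approach (the target paths here are placeholders, and you would want to sanity-check the generated statements before running them), building the ALTER statements from sys.master_files might look something like this:

DECLARE @NewDataPath nvarchar(260) = N'E:\SQLData\';
DECLARE @NewLogPath  nvarchar(260) = N'F:\SQLLogs\';

SELECT 'ALTER DATABASE ' + QUOTENAME(d.name)
     + ' MODIFY FILE (NAME = ' + QUOTENAME(mf.name)
     + ', FILENAME = '''
     + CASE mf.data_space_id WHEN 0 THEN @NewLogPath ELSE @NewDataPath END
     + RIGHT(mf.physical_name, CHARINDEX('\', REVERSE(mf.physical_name)) - 1)
     + ''');' AS alter_statement
FROM sys.master_files AS mf
    JOIN sys.databases AS d
        ON d.database_id = mf.database_id
WHERE d.database_id > 4;  -- skip the system databases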
