Press "Enter" to skip to content

Month: June 2020

Methods for Deploying a dacpac

Erik Ejlskov Jensen shares some advice when deploying dacpac files:

I have previously blogged about using a SQL Server Database Project together with EF Core and also described a NuGet package that enables you to build a .dacpac with .NET Core, even on Linux and macOS.

So the two blog posts above cover development and build. The next step is deployment.

The main deployment mechanism for making changes to your database based on your recently built .dacpac file is the cross-platform sqlpackage command-line tool.

You can, depending on your requirements, take advantage of several of the available actions this tool provides.

Read on to see two methods for deployment.
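For reference, a typical publish against a target database looks something like this; the file, server, and database names below are placeholders:

    sqlpackage /Action:Publish \
        /SourceFile:"bin/Release/MyDatabase.dacpac" \
        /TargetServerName:"localhost" \
        /TargetDatabaseName:"MyDatabase" \
        /p:BlockOnPossibleDataLoss=true

Swapping /Action:Publish for /Action:Script (plus an /OutputPath) generates the deployment script without applying it, which is handy when someone needs to review changes before they run.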


Organizing and Optimizing Power BI Dataflows

Marc Lelijveld wants your Power BI dataflows in tip-top shape:

In this blog I describe a few of the challenges you might face when you have a lot of dataflows. I will also describe a few tips and tricks I apply to sanitize your dataflow approach, organize dataflows, and make them easy to browse through.

Dataflows are increasingly used as a shared resource or staging layer inside the Power BI platform. With dataflows, you can push logic down and reuse it across different datasets. This lowers the impact on the source by extracting the data only once into Power BI, helps centralize logic, gives you one version of the truth, and brings lots of other advantages.

Read on for some tips and practices.


Understanding Monads in Scala

Anna Wykes continues a series on Scala for data engineers:

This is the second of my blogs in the Scala Parlour Series, in which we explore Scala, and why it is great for Data Engineering. If you haven’t already, please check out the first in the series here, in which you can read all about the core concepts of Scala, including who uses it and why.

In this article we will explore monads within the Functional Programming (FP) paradigm, and how they can be used in Scala to aid Data Engineering.  

Anna explains monads quite well here. This is a topic with a notorious reputation for difficulty, but conceptually it’s not as hard as people make it out to be…if you understand a few concepts coming in.
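If you want a concrete anchor before diving in, here is a minimal sketch of the idea using Scala’s Option type. This is my illustration rather than code from Anna’s post, and toIntOption assumes Scala 2.13 or later:

    // Two steps that can each fail, modeled with Option.
    def parseInt(s: String): Option[Int] = s.toIntOption
    def divide(a: Int, b: Int): Option[Int] =
      if (b == 0) None else Some(a / b)

    // flatMap is the monadic bind: it sequences the steps and
    // short-circuits to None as soon as any step fails.
    val result: Option[Int] = parseInt("10").flatMap(n => divide(100, n))

    // The same pipeline with for-comprehension sugar.
    val result2: Option[Int] = for {
      n <- parseInt("10")
      q <- divide(100, n)
    } yield q

Once you notice that List, Future, and Either all support the same flatMap-shaped chaining, the general concept tends to click.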


Evolutionary Algorithms for Color Palette Discovery

Daniel Oehm combines two interests:

Colour theory is pretty complex stuff, so choosing a good palette isn’t easy, let alone evolving one. You’re going to have some hits and some misses. This is definitely more for fun, seeing what you discover, rather than finding the perfect palette. Having said that, you could discover some gold!

There are best practices when choosing a palette for data visualisation, depending on the context and what is to be shown. For example, people tend to respond to certain colours representing high/low, hot/cold, or good/bad; there are also colourblindness considerations. evoPalette won’t necessarily adhere to these ideals.

I’d like to see a genetic algorithm approach, though you’d have to define some sort of function to score each outcome, so I can see how that’d be tricky. H/T R-Bloggers
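To make the tricky part concrete, a genetic algorithm needs a fitness function over palettes before selection can happen. Here is a hypothetical Python sketch; the scoring criteria are invented purely for illustration:

    import random
    import colorsys

    # A palette is a list of (r, g, b) tuples with components in [0, 1].
    def fitness(palette):
        # Invented score: reward palettes whose colours are spread out
        # in hue but close together in lightness.
        hls = [colorsys.rgb_to_hls(*c) for c in palette]
        hue_spread = max(h for h, l, s in hls) - min(h for h, l, s in hls)
        light_spread = max(l for h, l, s in hls) - min(l for h, l, s in hls)
        return hue_spread - light_spread

    def select_parents(population, k=2):
        # Keep the k best-scoring palettes to breed the next generation.
        return sorted(population, key=fitness, reverse=True)[:k]

    population = [[(random.random(), random.random(), random.random())
                   for _ in range(5)] for _ in range(20)]
    parents = select_parents(population)

The hard part is that “good palette” resists being captured in a formula, which is why letting a human pick the parents each generation is an appealing alternative.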


Understanding Conditions Under Which Query Store Captures Queries

Milos Radivojevic dives into testing Query Store:

From SQL Server 2019, you can use the fourth option, CUSTOM, which allows you to define what “relevant query” means. For that purpose, you can use three parameters. A query will be captured by Query Store if it is executed at least EXECUTION_COUNT times, if SQL Server spent at least TOTAL_COMPILE_CPU_TIME_MS of CPU time compiling it, or if its execution used at least TOTAL_EXECUTION_CPU_TIME_MS of CPU time. All three parameters are evaluated within a period defined by the fourth parameter, STALE_CAPTURE_POLICY_THRESHOLD, which has a default value of 1 hour.

You cannot define exceptions, i.e. queries that you do not want to see captured by Query Store, such as system queries or queries related to Query Store itself; you can only filter them out when you write queries against Query Store’s catalog views.

As mentioned earlier, in this post I will go into detail on the AUTO option. This is the default option in SQL Server 2019, and the decision to change the default from ALL to AUTO was a good one. However, the lack of documentation for this option can lead to mistrust and misunderstandings. Defaults are important for newcomers to the feature; they will use defaults until they learn and test enough to choose better options for their workload.

There’s a lot of depth in this post, and Milos wraps up with a few rules of thumb.
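For reference, the CUSTOM capture policy Milos describes is configured like this; the threshold values below are illustrative rather than recommendations:

    ALTER DATABASE [YourDatabase]
    SET QUERY_STORE = ON
    (
        QUERY_CAPTURE_MODE = CUSTOM,
        QUERY_CAPTURE_POLICY =
        (
            -- Evaluation window for the three thresholds below.
            STALE_CAPTURE_POLICY_THRESHOLD = 1 HOURS,
            EXECUTION_COUNT = 30,
            TOTAL_COMPILE_CPU_TIME_MS = 1000,
            TOTAL_EXECUTION_CPU_TIME_MS = 100
        )
    );

A query crossing any one of the three thresholds within the window gets captured.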


Building a Docker Container of a SQL Server Database

I have a post showing how to turn a database in SQL Server into a Docker container:

Today, we’re going to go through the process of turning a database you’ve built into a Docker container. Before we get started, here are the expectations:

1. I want a fully running copy of SQL Server with whatever database I’m using, as well as key components installed.
2. I want this not to be on a persistent volume. In other words, when I destroy the container and create a new one from my image, I want to reset back to the original state. I’m using this for technical demos, where I want to be at the same starting point each time.
3. I want this to be as easy as possible for users of my container. I consider the use of a container here as not particularly noteworthy in and of itself, so the more time I make people spend trying to set up my demo environment, the more likely it is that they will simply give up.

With that preamble aside, let’s get to work!
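One way this can work, sketched in hedged form (image names and the password are placeholders, and the post has the real details):

    # Start a base SQL Server container and restore the database into it.
    docker run -d --name demo-build -p 1433:1433 \
        -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=Placeh0lder!Pass" \
        mcr.microsoft.com/mssql/server:2019-latest
    # ...restore or attach your database inside the container here...

    # Stop the container, then snapshot it as a reusable image.
    docker stop demo-build
    docker commit demo-build my-demo-db:1.0

    # Every container created from the image starts from that same state,
    # satisfying requirement 2 above.
    docker run -d -p 1433:1433 my-demo-db:1.0

Because nothing sits on a persistent volume, destroying a container and recreating it from the image gives you the clean reset the demo scenario calls for.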

As a bonus, you can finally learn my real thoughts on medieval France. Fun story around that: a much longer time ago than I’m willing to admit, I played a Hundred Years War scenario in Civilization 2, and the one thing I remember from that scenario is killing the Dauphin. After that, the script spawned a new claimant to the throne, who immediately attacked my troops and died. And then the script spawned yet another new claimant, who met the same fate within a couple turns. And then a third. If I remember correctly, I ran France out of claimants to the throne by the end of it.


E-mailing Query Results via Logic Apps + Azure Data Factory

Rayis Imayev has to send an e-mail:

It is a very simple concept: you have a database query result, either direct database table output or a stored procedure’s returned result, that needs to be sent as an email notification. This email could serve as a status report for your daily data processing job or an alert notification with metrics that require immediate attention, and you want to be as user-friendly as possible with this message, i.e. HTML formatting for your dataset is required.

I built this solution approach over a year ago and wanted to document and share it after recently using it in one of my projects. Again, I agree there are other more or less effective ways to achieve the very same goal; this blog post is just a recollection of my efforts to send HTML-formatted SQL query results using Azure Data Factory and Logic Apps.

It’s a bit more effort than sp_send_dbmail, though also considerably more flexible.
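For contrast, the Database Mail route looks like this; the profile, table, and address are hypothetical:

    EXEC msdb.dbo.sp_send_dbmail
        @profile_name = N'DefaultMailProfile',
        @recipients = N'team@example.com',
        @subject = N'Daily processing status',
        @query = N'SELECT TOP (10) JobName, Status, RunDate
                   FROM dbo.JobLog
                   ORDER BY RunDate DESC;',
        @attach_query_result_as_file = 0;

The query results arrive as plain text, though, and getting genuinely readable HTML out of them takes work; that formatting gap is where the Logic Apps approach earns its keep.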


An Overview of the T-SQL Script DOM

Dan Guzman provides a public service:

Scripts are parsed by invoking the Parse method of the T-SQL script DOM library’s TSqlParser class. The parser understands the complex T-SQL abstract syntax tree and splits T-SQL source into atomic TSqlParserTokens of TSqlTokenTypes that represent keywords, identifiers, punctuation, literals, whitespace, etc. These low-level tokens are grouped into more meaningful TSqlFragment objects that represent language elements of the script DOM, such as batches, statements, clauses, etc. Fragments, rather than the low-level parser tokens, are most often used in practice, although the underlying tokens are available for specialized requirements.

The Parse method returns a TSqlFragment object of type TSqlScript containing all fragments within the script. This top-level fragment of the DOM hierarchy provides programmatic access to all language element fragments in the script. Nearly 1,000 different fragment types exist today due to the many granular T-SQL language elements.

Dan provides several examples of how to use the script DOM, making this a must-read if you’re interested in writing code around SQL Server.
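As a taste of the API, here is a minimal C# sketch of parsing a script and walking its batches and statements. The parser class is versioned (TSql150Parser targets SQL Server 2019-era syntax), so pick the one matching your target:

    using System;
    using System.IO;
    using Microsoft.SqlServer.TransactSql.ScriptDom;

    class ParseDemo
    {
        static void Main()
        {
            // true: treat double-quoted strings as quoted identifiers.
            var parser = new TSql150Parser(true);
            var fragment = parser.Parse(
                new StringReader("SELECT 1 AS x;\nGO\nSELECT 2 AS y;"),
                out var errors);

            if (errors.Count > 0)
            {
                foreach (var e in errors)
                    Console.WriteLine($"Line {e.Line}: {e.Message}");
                return;
            }

            // The top-level fragment is a TSqlScript.
            var script = (TSqlScript)fragment;
            foreach (var batch in script.Batches)
                foreach (var statement in batch.Statements)
                    Console.WriteLine(statement.GetType().Name);
        }
    }

For this input, the loop prints SelectStatement twice, once per batch.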


Reducing Costs in Azure

Matt Robertshaw has a few tips for saving money in Azure:

4. Enterprise Dev/Test subscriptions

Enterprise Dev/Test subscriptions are provided as part of Microsoft Enterprise Agreements (EA). They’re designed for teams of Visual Studio subscribers to run development and test workloads in Azure at discounted rates, specifically on Windows virtual machines and exclusive gallery images. This is significant because I see many clients using regular Azure subscriptions for development and test, which means they’re potentially paying more than they need to.

One additional point I’d make, though it kind of fits in with points 7 and 9, is to try to make your own services “cloud-first.” In other words, a typical cloud migration takes services built for on-prem data centers, converts the servers into VMs, ships those VMs up into Azure/AWS/Google/whatever, and then you end up paying more than you did on-prem.

Instead, read up a bit on cloud architecture and see how you might be able to change a service to fit that model. Instead of having a server running all the time, is it possible to store messages in a queue and have functions process these messages on a given schedule? Can you use expensive tools like Azure Synapse Analytics to perform nightly data processing and move the results to a much less expensive Azure SQL Database?
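As a rough sketch of that queue-plus-functions shape (the queue name, connection setting, and function are all illustrative; this uses the in-process C# Azure Functions model):

    using Microsoft.Azure.WebJobs;
    using Microsoft.Extensions.Logging;

    public static class ProcessWorkItem
    {
        // Runs only when a message lands on the queue, so there is
        // no always-on VM to pay for between messages.
        [FunctionName("ProcessWorkItem")]
        public static void Run(
            [QueueTrigger("work-items", Connection = "StorageConnection")] string message,
            ILogger log)
        {
            log.LogInformation($"Processing work item: {message}");
            // ...the work a dedicated server used to run around the clock...
        }
    }

Pair that with consumption-plan billing and the service costs money only while it is actually doing something.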
