Press "Enter" to skip to content

Curated SQL Posts

Architecting A Power BI Environment

Reza Rad explains different architectural patterns for a Power BI implementation:

Implementing a Power BI solution is not just about developing reports, creating a data model, or using visuals. Power BI, like any other technology, can be used in a correct or an incorrect way. Any technology can be used more effectively if it harnesses the right architecture, and the right architecture comes from gathering requirements and then designing the aspects and components of the technology to fit them. In this post, you will learn about some of the most common architectures for using Power BI, under three architecture guidelines:

  • Sharing architecture

  • Self-service architecture

  • Enterprise architecture

Read on to learn more about these three patterns.


Operator Precedence In Complex Check Constraints

Louis Davidson walks us through the case of the disappearing parentheses:

You will see the code is not the same as when I coded it:

Looking at this, one thing stands out to the eye: the parentheses have been removed from the comparisons, but added to the literal value of 2. (Also, everything is surrounded by square brackets, but as awful as that looks, it is a common thing to see in code that is generated/parsed, such as this or a query plan.) When the person who wrote the similar code saw this, they were concerned it was wrong; the next person who did a code review was also concerned it was wrong; and when I saw it, I was sure it was right, but only because I had a bit more faith that if this were a problem with SQL Server’s toolset, I would probably have read about it!

Read on as Louis explains the logic.  My preference would be to retain the parentheses to make it easier for humans to follow.
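As a general illustration of why the grouping matters (this is not Louis’s exact constraint), AND binds more tightly than OR in both T-SQL and Python, so a parser can safely drop parentheses only where precedence already implies the same grouping. A quick Python sketch of the idea, with made-up values:

    # Hypothetical predicate: the value is 2, or both flags are set.
    value, flag_a, flag_b = 2, False, False

    with_parens = (value == 2) or (flag_a and flag_b)     # True
    without_parens = value == 2 or flag_a and flag_b      # still True: AND binds before OR
    regrouped = (value == 2 or flag_a) and flag_b         # False: different grouping, different meaning

    print(with_parens, without_parens, regrouped)         # True True False

When the generated definition drops parentheses without changing the implied grouping, the constraint still means the same thing, which is the case Louis walks through.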


The Basics Of Lambda Calculus

Kevin Sookocheff walks us through some of the basics of Lambda calculus:

Functions are a bit more complicated. Michaelson states that a λ function serves as an abstraction over a λ expression, which isn’t that informative unless we take some time to understand what abstraction actually means.

Programmers use abstraction all the time by generalizing from a specific instance of a problem to a parameterized version of it. Abstraction uses names to refer to concrete objects or values (you can call them parameters if you like), as a means to create generalizations of specific problems. You can then take this abstraction (you can call it a function if you like), and replace the names with concrete objects or values to create a particular concrete instance of the problem. Readers familiar with refactoring can view abstraction as an “Extract Method” refactoring that turns a fragment of code into a method with parameters that explain the purpose of the method.

I think having a good understanding of Lambda calculus is a huge advantage for a data platform professional, as it gives you an inroad to learning data-centric functional programming languages (e.g., Scala, R, and F#) and neatly sidesteps the impedance mismatch problem with object-oriented languages.
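Michaelson’s description of abstraction maps neatly onto an ordinary anonymous function. A minimal sketch in Python (rather than raw λ-notation): start from a specific expression, give names to the parts that vary, and you have an abstraction you can apply to concrete values to recover particular instances.

    # A specific instance of a problem: the area of a 3-by-4 rectangle.
    specific = 3 * 4

    # Abstraction: name the varying parts (the parameters of the λ expression).
    area = lambda width, height: width * height

    # Application: substitute concrete values back in for the names.
    print(area(3, 4))    # 12 -- recovers the original instance
    print(area(7, 2))    # 14 -- a different instance of the same problem

This is the same move as the “Extract Method” refactoring mentioned above, just written as a λ expression.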


Findings From The Field: Reviewing Monitored Servers

Brent Ozar shares some findings from his SQL ConstantCare service:

Let’s look at a sample day from last week when 562 servers sent in data (out of the users who agreed to share their data to improve recommendations):

  • No one was using the -x startup flag

  • No one had known-dangerous third party modules installed

  • No one was running a 32-bit SQL Server

  • No one had high memory usage for in-memory OLTP (Hekaton)

  • No one’s plan cache had been erased recently (aside from server restarts)

  • No one needed to restart to fix a TempDB file configuration mismatch in the DMVs

  • Everyone on 2016 Standard Edition had patched to SP1 to get the free features

Read on to see the rules that customers are struggling with the most, as well as some thoughts on a scenario where a broken rule might not be that important.


Finding Invoices Containing A Product

Matt Allington shows how to use Power Query to get details on a top-level item (including all child items) which contains a particular child item:

As you can see in the image below, when a slicer is added to the product table and you then select an item in the slicer, the invoice detail table ONLY shows that single selected product. But the requirement is to see ALL the items on ALL the invoices that contain that selected product.

Read on, as the solution definitely isn’t trivial.
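To make the requirement concrete, here is a rough sketch of the same two-step logic in Python with pandas rather than Power Query M (the table and column names are invented for illustration; Matt’s actual solution is built in Power Query):

    import pandas as pd

    # Toy invoice-detail table: one row per product per invoice.
    detail = pd.DataFrame({
        "invoice_id": [1, 1, 2, 3, 3, 3],
        "product":    ["Apples", "Pears", "Apples", "Pears", "Plums", "Grapes"],
        "quantity":   [10, 5, 8, 4, 6, 7],
    })

    selected_product = "Pears"   # stands in for the slicer selection

    # Step 1: find every invoice that contains the selected product.
    matching = detail.loc[detail["product"] == selected_product, "invoice_id"].unique()

    # Step 2: return ALL lines on those invoices, not only the selected product's lines.
    print(detail[detail["invoice_id"].isin(matching)])

Getting that second step to respond to a slicer selection inside Power BI is where the complexity Matt works through comes in.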


SQL Server Disk Space Analysis

Jana Sattainathan has a solution using Powershell, Power BI, and T-SQL to track disk usage across a series of SQL Server instances:

This is just great on its own, as I get the information shown below. It is basically data like the following at the Database/FileGroup/File levels:

  • Database/Filegroup/File name

  • Size

  • Free size

  • Max size

  • Free of Max size

  • Size as a percent at the instance level

  • Free size as a percent at the instance level

  • Max size as a percent at the instance level

  • Free of Max size as a percent at the instance level

Read on to see how Jana makes use of this data, as well as where you can get the code.
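For the percent-at-the-instance-level numbers in that list, the arithmetic is simple once the file-level sizes are collected in one place. A small sketch of the idea in Python with pandas (the column names are invented and this is one reasonable reading of those metrics; Jana’s actual collection uses PowerShell and T-SQL):

    import pandas as pd

    # Toy file-level data for one instance.
    files = pd.DataFrame({
        "database": ["DB1", "DB1", "DB2"],
        "size_mb":  [500, 200, 300],
        "free_mb":  [100, 50, 30],
        "max_mb":   [1000, 400, 600],
    })

    instance_size = files["size_mb"].sum()
    instance_max  = files["max_mb"].sum()

    files["size_pct_of_instance"]        = 100 * files["size_mb"] / instance_size
    files["free_pct_of_instance"]        = 100 * files["free_mb"] / instance_size
    files["max_pct_of_instance"]         = 100 * files["max_mb"]  / instance_max
    files["free_of_max_pct_of_instance"] = 100 * files["free_mb"] / instance_max
    print(files)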


Monitoring InfluxDB Using Telegraf

Tracy Boggiano continues her performance metrics series by showing how to monitor the monitoring solution:

We need to generate a special config file for the Linux machine to capture the CPU and disk metrics. This config file will be in your /etc/telegraf folder.

After you run this, you will need to open it with your favorite editor; mine is vim:

Once the file is open, you can type the letter i to edit the file. Then type /influxdb to find the outputs.influxdb section of the file and edit the same sections we edited for Windows. Type /inputs.influxdb to jump down and edit the InfluxDB input section. After you have finished editing the file in vim, you can press Esc, type :wq!, and then hit Enter to save the config file.

Click through to check out her solution.


Notes On Automating Automatic Indexing

Grant Fritchey shares with us some of his findings with automatic indexing on Azure SQL Database:

What you’ll notice is that several of the queries are filtering on the FirstName column. There’s no good index there. If you look at the execution plans for those queries you’ll also note the Missing Index suggestion. That suggestion is a necessary part of the automatic indexing. Yeah, missing indexes. I know. They’re not always accurate. It’s just a suggestion. Blah, blah, blah. I hear you.

The magic is not supplied by missing indexes. The magic is supplied by lots of data. Microsoft can take advantage of three things. Yes, missing index suggestions are first. Then, they can use the query metrics gathered in Query Store to see the behavior of your queries over time. Finally, they can use machine learning algorithms to determine if indexes will be helpful and measure how helpful they’ve been if one gets added. It’s great stuff. Go and read on it.

Click through for more notes, as well as a Powershell script you can use to replicate his findings.


Graph Algorithms Supported In Neo4j

Amy Hodler gives us a quick summary of fifteen separate algorithms for traversing a graph in Neo4j:

6. PageRank

What it does: Estimates a current node’s importance from its linked neighbors and then again from their neighbors. A node’s rank is derived from the number and quality of its transitive links to estimate influence. Although popularized by Google, it’s widely recognized as a way of detecting influential nodes in any network.

How it’s used: PageRank is used in quite a few ways to estimate importance and influence. It’s used to suggest Twitter accounts to follow and for general sentiment analysis.

PageRank is also used in machine learning to identify the most influential features for extraction. In biology, it’s been used to identify which species extinctions within a food web would lead to the biggest chain reaction of species death.

If you are interested in getting into graph databases, it’s useful to know these algorithms.
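To get a feel for what the algorithm is computing, here is the textbook iterative formulation in plain Python with a damping factor of 0.85 (a sketch of the general technique, not Neo4j’s implementation, and it assumes every node has at least one outgoing link):

    # Directed graph as adjacency lists: node -> nodes it links to.
    graph = {
        "A": ["B", "C"],
        "B": ["C"],
        "C": ["A"],
        "D": ["C"],
    }

    damping, iterations = 0.85, 20
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}

    for _ in range(iterations):
        # Every node keeps a base share, then receives damped rank from its in-links.
        new_rank = {n: (1 - damping) / len(nodes) for n in nodes}
        for node, targets in graph.items():
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank

    print(sorted(rank.items(), key=lambda kv: -kv[1]))  # "C" comes out most influential here

Nodes with many good incoming links (“C” in this toy graph) accumulate rank, which is the “number and quality of its transitive links” idea described above.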


Demos Using Amazon QuickSight

Karthik Kumar Odapally and Pranabesh Mandal have several example visuals that you can generate using Amazon QuickSight:

Typical Amazon QuickSight workflow

When you create an analysis, the typical workflow is as follows:

  1. Connect to a data source, and then create a new dataset or choose an existing dataset.

  2. (Optional) If you created a new dataset, prepare the data (for example, by changing field names or data types).

  3. Create a new analysis.

  4. Add a visual to the analysis by choosing the fields to visualize. Choose a specific visual type, or use AutoGraph and let Amazon QuickSight choose the most appropriate visual type, based on the number and data types of the fields that you select.

  5. (Optional) Modify the visual to meet your requirements (for example, by adding a filter or changing the visual type).

  6. (Optional) Add more visuals to the analysis.

  7. (Optional) Add scenes to the default story to provide a narrative about some aspect of the analysis data.

  8. (Optional) Publish the analysis as a dashboard to share insights with other users.

It’s interesting to see how Amazon is trying to move this functionality from third-party tools (Power BI, Tableau, etc.) and notebooks right into the set of AWS offerings.  Contrast this with the way that Microsoft is building in Jupyter with Azure Notebooks.
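The workflow above is point-and-click in the QuickSight console, but QuickSight also exposes an API, so you can at least take inventory of datasets and dashboards from code. A minimal sketch using boto3 (the account ID and region are placeholders, and it assumes the caller has QuickSight permissions):

    import boto3

    account_id = "123456789012"   # placeholder AWS account ID
    qs = boto3.client("quicksight", region_name="us-east-1")

    # Datasets available as the starting point of an analysis.
    for ds in qs.list_data_sets(AwsAccountId=account_id).get("DataSetSummaries", []):
        print("dataset:", ds["Name"])

    # Dashboards that have been published from analyses.
    for db in qs.list_dashboards(AwsAccountId=account_id).get("DashboardSummaryList", []):
        print("dashboard:", db["Name"])

Creating datasets, analyses, and dashboards is possible through the same client as well, though the console workflow in the quoted steps is the more common path.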
