Press "Enter" to skip to content

Curated SQL Posts

Combining SAS + R for ML

Sophia Rowland has a demo of working with SAS Cloud Analytic Service from R:

The Scripting Wrapper for Analytics Transfer, also known as SWAT, is a package that allows R users to access the power of the SAS Cloud Analytic Service (CAS) from a familiar R interface. The SWAT package is available to SAS Visual Analytics (VA), SAS Visual Statistics (VS), and SAS Visual Data Mining and Machine Learning (VDMML) users. To begin working with SWAT, download and install the package from the SAS Software GitHub page.

Read on for the demo.

Leave a Comment

Online and Resumable Operations in SQL Server

Kendra Little summarizes which operations in SQL Server have the ability to be run online, which are resumable, and which support the WAIT_AT_LOW_PRIORITY flag:

ONLINE operations in SQL Server were simple to understand for years — we got ONLINE index rebuilds in SQL Server 2005. That was it for a while. Then, things got more complicated: we got more types of indexes. We got ONLINE options for schema changes that don’t involve indexes. We got more options for managing things like blocking, because online operations are really only mostly online — generally there’s going to be at least a short period where an exclusive lock is needed to update metadata. We now have some RESUMABLE operations coming in, too, for those big operations that are tough to handle.

Along the way, I fell behind. Because these features have steadily come out over a period of time, my brain simply didn’t register them all, or possibly I missed seeing them amid other announcements.

It’s not a comprehensive list, but it’s a good starting point for understanding the options you have available.

Leave a Comment

Checklist for an Azure VM Running SSRS

Kathi Kellenberger has a troubleshooting guide for setting up an Azure virtual machine to run SQL Server Reporting Services:

Recently I set up an Azure VM running SSRS for my students to host their database and report projects. My goal was to set up a custom domain name and use SSL. I ran into a few issues and a couple of wrong paths so I thought this would be helpful for others trying to accomplish the same. I’m not going to do a step-by-step walkthrough, but instead a checklist to help you troubleshoot

It’s not step-by-step, but it’s pretty close.

Leave a Comment

Creating a Gen-2 Azure Data Lake Store

Cecilia Brusatori shares how to build a generation-2 data lake in Azure:

Finally, you’ve decided that Data Lake Gen 2 is good for your Data Analytics Scenario and you’ve started the journey, went to the Azure Portal and searched for it. Mhh you don’t see it in the options to create it, let’s try the search bar [typing Data Lake Gen2….] Nothing… Ok maybe you’ve missed something…. nope!
So what is in fact a Data Lake Gen 2? it is a blob storage account, optimized for Data Analytics.
Let’s take a look at how you are able to create it!

If you’re used to the first generation, where Azure Data Lake Storage was its own thing, it might take a minute to realize where it went.

Leave a Comment

Designing a Layered Donut Chart in Power BI

Prathy Kamasani shares how to build a layered donut chart in Power BI:

You can view the full report here – https://prathy.com/portfolio/school-uk/

The tricky bit of any good data visualisation is finding interesting data, inspiration and story. In my report, my story was to show the comparison of Ofsted school ratings among the total number of schools. I also wanted to carry my story between report pages with colours.

So in this particular visual, I wanted to make it visually appealing, so I decided to show two metrics on each donut visual, the measure I want and the total number of schools. Then I resized each visual in a way so that they look like one visual. Those days there were no grouping, but now we can group them as one visual too. Also, when I first designed this report, default Power BI donut chart didn’t have an option to resize the ring, so I went for Circular gauge by MAQ software. Now we can use the default Donut chart too.

Click through for more details.

Leave a Comment

PARALLEL_REDO_FLOW_CONTROL Waits on Availability Groups

Taryn Pratt goes through a short outage at Stack Overflow:

While I can’t be 100% sure of the trigger, I’m 99.9% sure, because the job was running before the outage, so the timing is right. After looking through our monitoring logs, everything pointed to the job being the cause, so yes, I’m confident it caused it.

We don’t have regular maintenance windows for any of our servers, so we run jobs throughout the week, and if possible, try to schedule them during low-usage times. In this case, the job was an index maintenance job.

Now, before you scream at me about running an index maintenance job, I’m not going to argue the pros and cons of using it or whether or not we should run it — we can do that at another time. For this post, just accept the fact that we were running a job to rebuild/reorganize indexes

This is an interesting after-analysis of an outage. I have a lot of respect for people who can put these together and make them public—I would have a lot of trouble doing that myself.

Leave a Comment

Azure Data Factory Data Flows

Cathrine Wilhelmsen continues a series on Azure Data Factory:

So far in this Azure Data Factory series, we have looked at copying data. We have created pipelinescopy data activitiesdatasets, and linked services. In this post, we will peek at the second part of the data integration story: using data flows for transforming data.

But first, I need to make a confession. And it’s slightly embarrassing…

I don’t use data flows enough to keep up with all the changes and new features

To be fair to Cathrine, this is a rapidly-changing part of ADF.

Leave a Comment

Converting Databricks Notebooks to ipynb

Dave Wentzel shows how we can convert a Databricks notebook (in DBC format) to a normal Jupyter notebook (in ipynb format):

Databricks natively stores it’s notebook files by default as DBC files, a closed, binary format. A .dbc file has a nice benefit of being self-contained. One dbc file can consist of an entire folder of notebooks and supporting files. But other than that, dbc files are frankly obnoxious.

Read on to see how to convert between these two formats.

Leave a Comment

Web Scraping with F#

Jamie Dixon walks us through scraping a webpage using F#:

I need to go through all 8 pages of the grid and download the .pdfs that are associated with the “View Report” link. The challenge in this particular site is that they didn’t do any url parameters so there is no way to go through the grid via the uri. Looking at the page source, they are using ASP.NET and in typical enterprise-derpy manner, named their table “GridView1”

The way to get to the next page is to press on the “Next” link defined like this:

They over-achieved in the bloated View State for a simple page category though.

#Sigh

The code is straightforward and available as a Gist in the post.

Leave a Comment

Time Series Anomaly Detection with Power BI

Leila Etaati takes us through time series anomaly detection with Cognitive Services and Power Query:

I am excited about this blog post, this is based on the New service in Cognitive Service name “Anomaly Detection” which is now in Preview.
I recorded a video about how it works in cognitive service https://youtu.be/7ZOtZDbn6gM. 

However, I am going to talk about how to use it in Power BI. In this post first, a brief introduction to the anomaly detection will be presented, then how it can be used inside Power BI will be discussed.

It sounds like there are still some rough edges, but they already have the makings of an interesting service.

Leave a Comment