Press "Enter" to skip to content

Curated SQL Posts

Working With Temporal Line Charts In Power BI

Marco Russo shows off a few things you can do with Power BI to make displaying temporal data in line charts better:

The first line is related to the week ending on February 2nd, so Sales Amount includes only 2 days (February 1st and 2nd) excluding the amount of other 5 days in the same week (January 27th to 31st). The same happens in the last week, which includes June 29th and 30th but does not include sales for the remaining 5 days in the same week (July 1st to 5th). This also explains why the report includes a week ending in July 2008 even though the Month slicer only includes dates up to June 2018.

We can create a measure that removes incomplete weeks from the calculation, as shown in the following code. A similar technique could be used for incomplete months and quarters.

There are some interesting techniques that Marco shows off, including hiding incomplete weeks.

Comments closed

Configuring Snapshot Replication

Nisarg Upadhyay shows us how to configure snapshot replication:

On the next screen, configure the SQL Agent security. To configure the Agent security, click the Security Settings button. The Snapshot Agent Security dialog box opens. In the dialog box, provide the account under which the subscriber connects to the publisher. Moreover, provide the account information under which the SQL Server agent job will be executed. For this demo, SQL Server jobs are executed under the SQL server agent service account, hence select the Run under the SQL Server Agent service account option. Subscribers will be connected to the publisher using SQL login, hence select the Using the following SQL Server login option and provide SQL login and password. In this demo, connect using the sa login. Click OK to close the dialog box and Click Next.

Snapshot replication is the easiest to get right, but most of the setup is the same for transactional or merge replication.

Comments closed

Operating Management Studio With Multiple Active Directory Accounts

Kenneth Fisher shows how to use different Active Directory credentials when using SQL Server Management Studio:

To help promote the seperation of duties one of the things my company has done is to divide our permissions into two accounts. We have one account that is for our daily tasks. Reading email, searching the internet, basic structure changes in a database etc. The other account is our admin account. It’s for remoting to servers, security tasks, really anything that requires sysadmin. I’m not going to argue the advisability of this because honestly, I’m kind of on the fence. That said, I do have to deal with it and there are a few tips in case you have to deal with it as well.

And if you’re not on the domain as well, runas /netonly /user:[domain\username] ssms.exe will do the job.

Comments closed

Deploying Cloudera Enterprise On Azure

Xavier Morera announces a new Cloudera course:

You will start by learning the Microsoft Azure services required to deploy a secure, elastic, Cloudera Enterprise cluster. These core services include security, networking, virtual machine management, and storage, just to name a few.

Then, you’ll learn best practices and patterns for cloud-based clusters, including tips and caveats for security and workload management.

Next, you’ll learn how to bootstrap a cluster using Cloudera Manager, which allows you to deploy a cluster on premises or in the cloud. The module covers how to deploy both development (Path A) and production-grade (Path B) clusters.

This is a free course, so if you’re looking for a way to fill your Thanksgiving weekend, this is definitely an option.

Comments closed

Python In SQL Server Reporting Services

Tomaz Kastrun shows how we can visualize results from Python models in SQL Server Reporting Services:

As we have created four different models, we would also like to have the accuary of the model visually represented using SSRS.

Showing plots created with Python might not be as straight forward, as with R Language.

Following procedure will extract the data from database and generate plot, that can be used and visualized in SSRS.

Tomaz shows us examples of displaying data as well as visuals generated in Python.

Comments closed

Running Python-Based ML Tasks In Excel

Tony Roberts shows off some of the functionality of PyXLL:

Once we’ve done the hard work of building and testing a model we need to put it to some use! Excel is a great front-end tool for playing with data interactively. It’s used virtually everywhere and so being able to deliver your model in Excel to non-developer users massively opens up opportunities for how it can be used in your business. Even if the model is being used as part of a real-time or batch system, being able to call the model interactively can be really helpful when trying to understand the behaviour of a system.

Fortunately now the model is written in Python getting it into Excel is extremely simple. PyXLL, the Python Excel Add-In has everything we need to write Python for Excel. All we need to do is add a few @xl_func decorators from the pyxll module and configure the PyXLL add-in to load the module containing our model.

If you’re not already familiar with PyXLL, check out the introduction to PyXLL from the user guide.

I mean, if the data’s going to live in Excel spreadsheets anyhow…

Comments closed

More Tabular Best Practices

Ginger Grant continues her series on Analysis Services Tabular best practices:

Optimize your DAX Code

While it is not easy to performance tune DAX you can do it, by evaluating the DAX Query Plan and VeritPaq Queries, and SQLBI’s VertiPaq Analyzer. Also, you can also look to use functions which perform better, for example COUNTROWS instead of DISTINCTCOUNT or ADDCOLUMNS instead of SUMMARIZE. Whenever possible use the CALCULATE function instead of the FILTER function, as CALCULATE filters for context inside the parenthesis and are more efficient. Also all of the iterative functions SUMX, COUNTX etc., should be used sparingly as the row-by-row transactions they create are less efficient and should be used only when SUM or COUNT will not work.  When evaluating if a value missing, if it is possible, use ISEMPTY instead of ISBLANK as ISEMPTY looks only for the presence of a row, which is faster than the evaluation performed by ISBLANK.

Read on for several more items in this vein.

Comments closed

What’s New With In-Memory OLTP In SQL Server 2019

Ned Otter gives us two things to look forward to with SQL Server 2019:

So far, there’s been only one publicly announced enhancement for In-Memory OLTP in SQL 2019: system tables in TempDB will be “Hekatonized”. This will forever solve the issue of system table contention in TempDB, which is a fantastic use of Hekaton. I’m told it will be “opt in”, so you can use this enhancement if you want to, but you can also back out of it, which would require a restart of the SQL Server service.

But there’s at least one other enhancement that’s not been announced, although the details of its implementation are not yet known.

Read on to learn about this not-yet-announced update.

Comments closed

Deep Dive On Index Spools

Hugo Kornelis takes a look at index spools:

The Index Spool operator is one of the four spool operators that SQL Server supports. It retains a copy of all data it reads in an indexed worktable (in tempdb), and can then later return subsets of these rows without having to call its child operators to produce them again.

The Index Spool operator is quite similar to Table Spool, except that Index Spool indexes its data, giving it the option to return a subset, and Index Spool lacks the option to read data from a spool created by another operator. The other two spool operators are quite different: Row Count Spool is optimized for specific cases where the rows to be returned are empty, and Window Spool is used to support the ROWS and RANGE specifications of windowing functions.

Eager and Lazy Spool operators rank high on my list of “troublesome when I see them” operators.  The reason is not so much that eager or lazy spools are inherently bad—they’re not, as they are efficient ways to perform a particular query given the constraints of that query—but if I see one of them in conjunction with a slowly-performing query, it’s a good sign that I want to optimize away the need for spooling.

Comments closed