This is the story of two products – or rather, one product that is now a service and another product that is now a component of another product. A few years ago, Microsoft began to formulate a mobile usability story out of many fragmented tools. They had a really good reporting product, SSRS, and a pretty good self-service BI capability offered as a bunch of Excel add-ins, namely Power Pivot, Power Query, and Power View – but none of it did mobile. They bought Datazen, a decent mobile reporting and dashboard tool designed primarily for IT developers and semi-tech-savvy business pros to quickly create mobile dashboards using traditional data sources. Datazen wasn’t really a self-service BI tool and wasn’t really designed to work with BI data in the true sense. It was a good power user report tool, but it was young and needed to be refined and matured as a product. Datazen became “Reporting Services Mobile Reports” and was integrated into the SSRS platform as a separate reporting experience with a separate design tool, optimized exclusively for use on mobile devices via platform-specific phone and tablet apps. Since the initial roll-out, product development has stalled: Mobile Reports has not changed at all since it was released with SQL Server 2016 Enterprise Edition.
Paul gives us his current advice, as well as a hint at where things could be going.
Unit testing helps us to write better code and make rapid changes to our code, and it has been generally seen as a good idea for about 10 years. Writing tests for T-SQL code is made much easier by using tSQLt, but there is quite a high barrier to entry, both in the technical skills needed to get tSQLt running and in knowing how to approach large code bases of sometimes unfriendly T-SQL code and tame that code with unit tests.
I have successfully unit tested T-SQL code in a number of different environments, including clean greenfield environments as well as legacy projects, and I have written this course to help people get started with unit testing, and also to help them turn unit testing into a part of their development process that they can use every day to improve the quality of their work and the speed at which deployments can be made.
Click through to sign up.
Bert Wagner has a two-part series on SQL injection. In the first post, he shows how to use sp_executesql to parameterize queries:
The important thing to note in the query above is that we are generating a dynamic SQL statement; that is, we are building the SQL query string, and then we are executing it.
Imagine this stored procedure is running in order to display a “Welcome <Full Name>!” message in our app — a website visitor types in their @ParmUserName and we execute the stored procedure to return their full name.
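As a hedged illustration of the same principle outside T-SQL, here is string-built versus parameterized SQL using Python’s built-in sqlite3 module (the table and data are made up for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (UserName TEXT, FullName TEXT)")
conn.execute("INSERT INTO Users VALUES ('alice', 'Alice Smith')")

user_input = "nobody' OR '1'='1"  # a classic injection attempt

# Vulnerable: building the SQL query string by concatenation, then executing it
unsafe_sql = f"SELECT FullName FROM Users WHERE UserName = '{user_input}'"
print(conn.execute(unsafe_sql).fetchall())  # returns every row in the table

# Safe: the driver sends the value as a parameter, never as SQL text
safe_sql = "SELECT FullName FROM Users WHERE UserName = ?"
print(conn.execute(safe_sql, (user_input,)).fetchall())  # returns nothing
```

The parameterized form plays the same role sp_executesql plays in T-SQL: the user input can never change the shape of the statement.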
In his second post, Bert shows what to do if you need to run a query off of a dynamically-selected table:
Unfortunately we have to fall back on SQL’s
However, like we discussed last week, we need to be vigilant about what kind of user input we allow to be built as part of our query.
Assuming our app layer is already sanitizing as much of the user input as possible, here are some precautions we can take on the SQL side of the equation:
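One precaution that comes up often for dynamically-selected table names is validating the input against an explicit whitelist before it is ever spliced into a query string. A minimal Python sketch of the idea (the table names and helper function are hypothetical):

```python
ALLOWED_TABLES = {"Sales", "Customers", "Orders"}  # explicit whitelist

def build_count_query(table_name: str) -> str:
    """Splice in the table name only if it matches a known-good value."""
    if table_name not in ALLOWED_TABLES:
        raise ValueError(f"Table not permitted: {table_name!r}")
    return f"SELECT COUNT(*) FROM {table_name}"

print(build_count_query("Sales"))  # SELECT COUNT(*) FROM Sales
# build_count_query("Sales; DROP TABLE Sales") raises ValueError
```

In T-SQL itself, the equivalent belt-and-braces step is checking the name against system catalog views and wrapping it with QUOTENAME.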
Read on for more.
Once I understood the data well, I realized that the raw data had to be flattened out, expanding collections (like KB) at the row level into their own rows so that every row holds a single value in each column. Then the grouping is easy.
It made more sense to allow grouping not just by KB but by other columns like Product or CVE. Group-Object works fine for most cases, but since there will be duplicates after the data is grouped, it is easier to just do it with hashtables.
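The flatten-then-group approach can be sketched in Python with plain dicts standing in for the hashtables (the product and KB values are made-up stand-ins for the update data):

```python
raw = [
    {"Product": "SQL Server 2016", "KB": ["KB4001", "KB4002"]},
    {"Product": "SQL Server 2017", "KB": ["KB4002"]},
]

# Flatten: expand each KB collection so every row holds a single value
flat = [{"Product": r["Product"], "KB": kb} for r in raw for kb in r["KB"]]

# Group with a plain dict (the hashtable approach): KB -> list of products
groups = {}
for row in flat:
    groups.setdefault(row["KB"], []).append(row["Product"])

print(groups)
# {'KB4001': ['SQL Server 2016'], 'KB4002': ['SQL Server 2016', 'SQL Server 2017']}
```

Once every row is single-valued, grouping by any other column (Product, CVE) is the same one-liner with a different key.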
Jana provides the entire solution on his site. When reading it, I felt the urge to switch to a language which offers easier pivoting and aggregation, but the code was clear and understandable.
It seemed some of the rows in my CSV files exceeded an upper limit on how much the Extractor.Csv function can handle and adding the silent:true parameter didn’t solve the issue.
I dug a bit deeper and found rows in some of the files that are long – really long. One in particular was 47MB for a single row, and it was valid data. I could have manually edited these out by hand, but I thought I’d see if I could solve it another way.
After some internet research and a couple of helpful tweets to and from Michael Rys, I decided to have a go at making my own custom U-SQL extractor.
Phillip has included the custom extractor code, so if you find yourself needing to parse very large rows of data in U-SQL, you’ll definitely be interested in this.
That doesn’t work (apparently by choice!). And I find myself in the odd position of having to defend expecting nrow() to return the number of rows.
There are a number of common legitimate uses of nrow() in user code and package code, including:
Checking if a table is empty.
Checking the relative sizes of tables to re-order or optimize complicated joins (something our join planner might add one day).
Confirming data size is the same as reported in other sources (database, and so on).
Reporting amount of work performed or rows-per-second processed.
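These uses aren’t specific to R; as a rough Python analogue (with made-up table data), the first two items boil down to relying on an exact row count:

```python
def is_empty(rows):
    # Checking if a table is empty
    return len(rows) == 0

def join_order(tables):
    # Re-order a complicated join smallest-table-first, by relative size
    return sorted(tables, key=lambda t: len(t["rows"]))

small = {"name": "dim_region", "rows": [1, 2]}
big = {"name": "fact_sales", "rows": [1, 2, 3, 4, 5]}

print(is_empty([]))  # True
print([t["name"] for t in join_order([big, small])])
# ['dim_region', 'fact_sales']
```

All of these break down if the row count is lazy or approximate, which is exactly the complaint.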
Read the whole thing; this seems unnecessarily complicated.
Although the Out-GridView cmdlet existed in PowerShell version 2.0, you’ll need PowerShell version 3.0 to use it as shown in this blog article, because the OutputMode parameter of Out-GridView was added in version 3.0. Also, Out-GridView can only be used on operating systems with a GUI (it cannot be used on Server Core).
As far as I know, there’s no way to set a default region in Azure like there is with the AWS Initialize-AWSDefaultConfiguration cmdlet. I guess if you really wanted to set a default, you could always use $PSDefaultParameterValues to set a default value for the Location parameter for the cmdlets that require it.
Out-GridView is great for what it is: an easy UI within Powershell, with sorting and filtering built in.
We can use minidumps to generate small files that contain SQL Server call stacks. Note that if you aren’t careful you can end up writing the full contents of memory to a dump file. I believe that writing the file is a single-threaded process, so this can take hours and hundreds of GBs on a real server. Also SQL Server is frozen while the dump is happening, so don’t do it in production.
Dumps are most useful when you want to examine a long-running, very simple query, or a query that seems to be stuck at a certain point. They aren’t useful for seeing the full call stack of a non-simple query, or for catching something that happens just once during execution, like a specific task during query compilation. All that you can get is a snapshot, and the snapshot may not contain the information that you’re looking for.
There are multiple ways to generate and read dump files. For this blog post I’ll be using sqldumper.exe and WinDbg.
Click through to read the whole thing.
I have a series of blog posts about temporal tables in SQL Server 2016, which you can see here. However, one thing I never mentioned was that the temporal row start and end times are always stored in UTC (which is generally inconvenient to work with unless you are from the Great Britain area). So when you are looking at your data and want to see what it looked like at, say, ‘yesterday at 12:00’ at the computer you are working from, that takes a bit more work.
In SQL Server 2016, a new feature was added that totally passed me by: AT TIME ZONE. It is used to manipulate the time zone offset of a point-in-time value, either to append the time zone to a value or to shift the time zone of a DATETIMEOFFSET value. The most common practical use, which I will show, is converting a value in your local time to UTC (and vice versa).
This is something I tend to forget about, but it can be useful when building quick reports from UTC data. I’d store all data in UTC to keep dates and times consistent across servers in different locations, but you don’t want to have to do those conversions in your head all the time.
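The shift AT TIME ZONE performs can be sketched with Python’s standard zoneinfo module; the stored timestamp and target zone below are made-up examples:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# A temporal-table style row-start time, stored in UTC
row_start_utc = datetime(2017, 8, 14, 16, 0, tzinfo=timezone.utc)

# Roughly: row_start AT TIME ZONE 'Eastern Standard Time'
local = row_start_utc.astimezone(ZoneInfo("America/New_York"))
print(local.isoformat())  # 2017-08-14T12:00:00-04:00
```

As with AT TIME ZONE, the conversion accounts for daylight saving: the August date lands on the -04:00 offset rather than the standard -05:00.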
We intended to fetch all customers and their addresses. But observe: we project only columns from the CUSTOMER table, and we don’t have any predicates at all, specifically no predicates using the ADDRESS table. So, we’re completely ignoring any contributions from the ADDRESS table. We never really needed the JOIN in the first place!
And in fact, the optimiser can prove this too, because of the FOREIGN KEY constraint on C.ADDRESS_ID, which guarantees that every CUSTOMER record has exactly one corresponding ADDRESS record. The JOIN does not duplicate nor remove any CUSTOMER rows, so it is unneeded and thus eliminated (by some, though not all, databases; each database is listed at the end of the article).
So, the database can rewrite the SQL statement to the following, equivalent SQL statement in the presence of said FOREIGN KEY
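The equivalence being exploited is easy to demonstrate directly: with the foreign key in place and the column NOT NULL, the join neither adds nor removes rows, so both queries return the same CUSTOMER rows. A minimal sqlite3 sketch with made-up tables and data (SQLite does not itself perform this rewrite; this only shows the two queries are equivalent):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE address (address_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name TEXT,
    address_id INTEGER NOT NULL REFERENCES address(address_id)
);
INSERT INTO address VALUES (1, 'Zurich'), (2, 'Boston');
INSERT INTO customer VALUES (10, 'Alice', 1), (11, 'Bob', 2);
""")

# The original query: joins ADDRESS but projects only CUSTOMER columns
with_join = conn.execute("""
    SELECT c.customer_id, c.name
    FROM customer c JOIN address a ON a.address_id = c.address_id
    ORDER BY c.customer_id
""").fetchall()

# The rewritten query: the join eliminated entirely
without_join = conn.execute("""
    SELECT c.customer_id, c.name FROM customer c ORDER BY c.customer_id
""").fetchall()

print(with_join == without_join)  # True
```

A join-eliminating optimiser makes exactly this argument from the constraints, without ever running either query.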
Read on for a comparison across different products as well.