Curated SQL – Page 841 – A Fine Slice Of SQL Server

Well, this seems to be good news for the sales team: rising sales! Yet, how does this model arrive at those numbers? To understand what is going on we will now rebuild the model. Basically, everything is in the name already: auto-regressive, i.e. a (linear) regression on (a delayed copy of) itself (auto, from the Ancient Greek for "self")!
So, what we are going to do is create a delayed copy of the time series and run a linear regression on it. We will use the lm() function from base R for that (see also Learning Data Science: Modelling Basics).

Read on for some additional understanding.
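
The excerpt describes the idea in terms of R's lm(). As a rough illustration of the same idea in Python, here is a minimal sketch that regresses a series on a one-step delayed copy of itself. The function names fit_ar1 and forecast, the use of NumPy, and the restriction to a single lag are assumptions made for this example, not the original author's code.

```python
import numpy as np


def fit_ar1(series):
    """Fit y[t] = intercept + slope * y[t-1] by ordinary least squares.

    Hypothetical helper for illustration: a single-lag model only.
    """
    y = np.asarray(series, dtype=float)
    lagged = y[:-1]   # delayed copy: the value at time t-1
    current = y[1:]   # the value at time t
    # Design matrix: a column of ones (intercept) plus the lagged values.
    X = np.column_stack([np.ones_like(lagged), lagged])
    (intercept, slope), *_ = np.linalg.lstsq(X, current, rcond=None)
    return float(intercept), float(slope)


def forecast(series, intercept, slope, steps):
    """Roll the fitted relation forward, feeding each prediction back in."""
    predictions = []
    last = float(series[-1])
    for _ in range(steps):
        last = intercept + slope * last
        predictions.append(last)
    return predictions


if __name__ == "__main__":
    # Synthetic series that follows y[t] = 2 + 0.5 * y[t-1] exactly.
    data = [10.0, 7.0, 5.5, 4.75, 4.375]
    a, b = fit_ar1(data)
    print(a, b)
    print(forecast(data, a, b, steps=3))
```

Each forecast is fed back in as the "previous value" for the next step, which is why such a model can project several periods ahead from one fitted line.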

Tips for Debugging in Visual Studio

Published 2020-07-01 by Kevin Feasel

Patrick Smacchia has 12 tips for debugging in Visual Studio:

4) Data breakpoint: Break when value changes
If you set a breakpoint on a non-static property setter, it will be hit when the property value changes on any object. The same behavior can be obtained for a single object by right-clicking it in the Locals (or Watch) window and choosing the Break When Value Changes menu item.
This facility is illustrated with the animation above. The hit occurs only when obj2.Prop is changed, not when obj1.Prop is changed.

These go a step beyond the basics, so check them out.

PRINTing More Than 8000 Bytes

Published 2020-07-01 by Kevin Feasel

Richard Swinbank hits on a bugbear of mine:

A feature of T-SQL is that strings longer than 8000 bytes are truncated by PRINT. If you haven’t already discovered this, you might wonder why it’s a problem – the answer (for me at least) is dynamic SQL.

Read on for Richard’s answer. There is an easier way to do it with the paid version of SQL# by using the Util_Print function.
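
One common way around the limit is to emit a long string in several pieces, each short enough to survive PRINT, and to cut at line breaks so the output stays readable. The sketch below shows that chunking logic in Python rather than T-SQL; it is not Richard's solution or the SQL# implementation. The function name split_for_print is hypothetical, and the default of 4000 characters assumes an nvarchar string at 2 bytes per character.

```python
def split_for_print(text, limit=4000):
    """Split text into pieces of at most `limit` characters.

    Hypothetical helper: prefers to cut at the last newline inside each
    window and falls back to a hard cut when no newline is available.
    """
    chunks = []
    start = 0
    while len(text) - start > limit:
        # Last newline within the current window, or -1 if there is none.
        cut = text.rfind("\n", start, start + limit)
        if cut <= start:
            # No usable newline: cut at the limit and lose nothing.
            cut = start + limit
            chunks.append(text[start:cut])
            start = cut
        else:
            # Cut at the newline and skip it, since each printed
            # chunk ends with its own line break anyway.
            chunks.append(text[start:cut])
            start = cut + 1
    chunks.append(text[start:])
    return chunks


if __name__ == "__main__":
    sql = "SELECT 1;\n" * 1000
    for piece in split_for_print(sql):
        print(piece)
```

When a cut lands on a newline, joining the pieces with "\n" gives back the original text; a hard cut splits a line in two, which is the trade-off for never exceeding the limit.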

Configuring Power BI Incremental Refresh

Published 2020-07-01 by Kevin Feasel

Gilbert Quevauvilliers has a follow-up from a post:

Following on from my successful blog post How you can incrementally refresh any Power BI data source (This example is a CSV File), I found a way where I can just use dates created in Power Query to get data refreshing incrementally.
Full credit goes to Rafael Mendonça who actually figured this out. All that I have done is to translate what Rafael Mendonça did in his PBIX and put it into steps that you can follow along with.
https://www.rafaelmendonca.com/2020/06/incremental-powerbi-csv-api-excel-odbc.html
In this blog post I am going to demonstrate how to get this working in a way that I hope is very easy to follow.

Read on for the process.

Truncating All Tables in a Database with PowerShell

Published 2020-07-01 by Kevin Feasel

Jess Pomfret nukes the database from orbit, as it’s the only way we can be sure:

The most popular post on my blog so far was called ‘Disable all Triggers on a Database’ and this one is a good follow up from that post.
The scenario here is you need to remove all the data from the tables in your database. This could be as part of a refresh process, or perhaps to clear out test data that has been entered through an application. Either way, you want to truncate all the tables in your database.

Click through for the code.

Keep Parameter Sniffing On

Published 2020-07-01 by Kevin Feasel

Brent Ozar explains why you should keep parameter sniffing on:

What they THINK is going to happen is that SQL Server will do an OPTION(RECOMPILE) on every incoming query, building fresh plans each time. That ain’t how this works at all, and instead, I wish this “feature”’s name was “Parameter Blindfolding.” Here’s what it really does.

Read on for the explanation. In reality, parameter sniffing is almost always a good thing. It’s when you have major skews in data that you even have to think about parameter sniffing being a problem.

Tips for Improving Power BI Dashboards

Published 2020-07-01 by Kevin Feasel

Tino Zishiri has a set of tips to design better-looking dashboards:

There are several reasons why you should design great looking dashboards. Here are a few:
– They make information more accessible – end users benefit from an intuitive design that makes insight easy to obtain so they can make informed decisions.
– They help convey your message – you’re in a better position to tell a coherent story. Applying design principles can also help accentuate your message. My colleague Kalina Ivanova has written an excellent series of blogs on Data Storytelling with Power BI.
– They encourage user adoption – if a report is useful to users and has a great look and feel then you’re winning.
In this blog, I’ll briefly cover the building blocks that make up a good Power BI dashboard. I then explore the stepping stones that will level up your dashboard and take it from good to great.

One area where I do have some disagreement is that the Z and F layouts are fine for text-heavy formats, but generally “text-heavy” and “dashboard” don’t go together very well. My preference is the notion of focal points (go about 3/4 of the way down, to the section entitled “Where We Look”), which works much better at describing eye behavior for image-heavy layouts. That aside, I like this post a lot.

Using Flink in Zeppelin Notebooks

Published 2020-06-30 by Kevin Feasel

Jeff Zhang continues a series on using Apache Flink in Zeppelin Notebooks:

With Zeppelin, you can build a real-time streaming dashboard without writing a single line of JavaScript/HTML/CSS code.
Overall, Zeppelin supports 3 kinds of streaming data analytics:
– Single Mode
– Update Mode
– Append Mode

Read on for examples of each of these, as well as a few tips around user-defined functions.

Cost-Cutting in Confluent Platform

Published 2020-06-30 by Kevin Feasel

Nick Bryan shares some techniques for reducing the cost of running on Confluent Platform:

To start, there are several Confluent Platform features that can greatly reduce your Kafka cluster’s infrastructure footprint. For use cases involving high data ingestion rates, lengthy data retention periods, or stringent disaster recovery requirements, Confluent Platform can help to reduce infrastructure costs by up to 50%.
One of the most important features for this cost category is Tiered Storage.

Read on for a few tips.

Improvements in Spark 3.0

Published 2020-06-30 by Kevin Feasel

Alex Woodie covers some of the improvements available to us in Apache Spark 3.0:

The addition of join hints further enhances the accuracy of the compiler when the built-in algorithms deliver a suboptimal plan. “When the compiler is unable to make the best choice, users can use join hints to influence the optimizer to choose a better plan.”

Ah, join hints—the double-edged sword.
