Curated SQL Posts

Higher-Order Functions In R

Holger von Jouanne-Diedrich explains the concept of higher-order functions using R as an example:

The part that causes the biggest difficulties (especially for beginners of R) is that you state the name of the function at the beginning and use the assignment operator – as if functions were like any other data type, like vectors, matrices or data frames…

Congratulations! You just encountered one of the big ideas of functional programming: functions are indeed like any other data type, they are not special – or in programming lingo, functions are first-class members.

This is one of the core tenets of functional programming: functions are values you can pass to other functions. They aren’t special, inviolate pieces of code; they are just another kind of data. Click through for a couple of good examples of what you get in a language that supports higher-order functions.
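
The linked post works in R; as a minimal sketch of the same idea in Python, the snippet below stores a function in a variable, passes it as an argument, and returns one from another function. The names `apply_twice` and `make_adder` are made up for illustration and do not come from the post.

```python
from typing import Callable


def apply_twice(f: Callable[[int], int], x: int) -> int:
    """Higher-order function: takes another function as an argument."""
    return f(f(x))


def make_adder(n: int) -> Callable[[int], int]:
    """Higher-order function: builds and returns a new function."""
    def add(x: int) -> int:
        return x + n
    return add


# A function is assigned to a name just like a number or a list would be.
add_three = make_adder(3)

# The function is passed around as an ordinary value.
result = apply_twice(add_three, 10)

# Built-ins such as map() are higher-order functions too.
squares = list(map(lambda v: v * v, [1, 2, 3]))
```

Here `result` is 16 (3 is added twice to 10) and `squares` is `[1, 4, 9]`.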

Thoughts On The Year’s Big Data Platform News

Kevin Chant shares some thoughts on some of the biggest news stories of 2018 for data platform professionals:

The Hortonworks and Cloudera announcement about their merger is certainly an interesting one for the Big Data landscape. These two are thought to be the leaders in the Hadoop industry.
Undeniably, a lot of people have seen what these two Big Data giants have delivered over the years within the Hadoop ecosystem.
With this merger they are aiming to use their combined expertise to deliver an enterprise data cloud. We’ve already seen what Hadoop-based cloud offerings like HDInsight are capable of, so the potential here is huge.
Certainly, there’s potential for this to have massive implications in the Big Data industry. And this merger could also encourage even more Data Platform offerings to emerge.

Read on for Kevin’s thoughts on five major stories this year.

Getting The Latest File With Power Query

Matt Allington shows us one technique to get the latest version of a file using Power Query:

This pattern is common if your new file contains a superset of all the data.  It could be a transactional file that grows in length each time or it could be a dimension/lookup table (such as Customers) that can change slowly over time, and you always want to see the latest version.  My advice to all my Power Query students is “zero touch the file”.  In other words, your objective should always be to have the absolute minimum amount of interaction with the source files possible and push all the work into Power Query.  This will minimise the amount of work/rework you have to do in the future.  Thinking about the use case here “load the latest version of a file”, the question becomes “how can I make this zero touch”?  There are a few issues to consider including naming/renaming of the file and also archiving old copies of the file.  This doesn’t sound like zero touch to me.

Read the whole thing and check out the video.
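
Matt's solution is built in Power Query; the sketch below only illustrates the general "pick the newest file in a folder" idea in Python. It assumes "latest" means the most recent modification time, which may not be the criterion the post uses, and the function name `latest_file` and the default `*.csv` pattern are hypothetical.

```python
from pathlib import Path
from typing import Optional


def latest_file(folder: Path, pattern: str = "*.csv") -> Optional[Path]:
    """Return the most recently modified file matching pattern, or None."""
    # Only consider regular files, not sub-folders that happen to match.
    candidates = [p for p in folder.glob(pattern) if p.is_file()]
    if not candidates:
        return None
    # Assumption: the newest file is the one with the latest modified time.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Calling `latest_file(Path("exports"))` on a hypothetical `exports` folder returns the newest CSV without anyone renaming or archiving older copies, which is the "zero touch" goal described above.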

What Happens With Multiple Missing Indexes

Arthur Daniels shows us what happens when there are multiple missing indexes in an execution plan:

This is missing index request #1, and by default, this is the only missing index we’ll see by looking at the graphical execution plan. There’s actually a missing index request #2, which we can find in the XML (I know, it’s a little ugly to read. Bear with me).

I am of two minds on this. It probably should be easier to see multiple index candidates, but there’s already so much risk of people just copy-pastaing missing index recommendations that adding more seems like a bad idea.
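
To make the XML part less ugly to read, here is a rough Python sketch that lists every missing index request in a plan's XML rather than just the first. The sample plan is a hand-written, heavily trimmed fragment with made-up table and column names, not output from the post; real plans nest these elements much deeper, which the `.//` search accounts for.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Namespace used by SQL Server showplan XML documents.
NS = {"sp": "http://schemas.microsoft.com/sqlserver/2004/07/showplan"}

# Hypothetical, trimmed plan fragment for illustration only.
SAMPLE_PLAN = """<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <MissingIndexes>
    <MissingIndexGroup Impact="95.1">
      <MissingIndex Database="[Demo]" Schema="[dbo]" Table="[Posts]">
        <ColumnGroup Usage="EQUALITY">
          <Column Name="[OwnerUserId]" ColumnId="1" />
        </ColumnGroup>
      </MissingIndex>
    </MissingIndexGroup>
    <MissingIndexGroup Impact="42.3">
      <MissingIndex Database="[Demo]" Schema="[dbo]" Table="[Comments]">
        <ColumnGroup Usage="EQUALITY">
          <Column Name="[PostId]" ColumnId="2" />
        </ColumnGroup>
        <ColumnGroup Usage="INCLUDE">
          <Column Name="[Score]" ColumnId="3" />
        </ColumnGroup>
      </MissingIndex>
    </MissingIndexGroup>
  </MissingIndexes>
</ShowPlanXML>"""


def missing_index_requests(plan_xml: str) -> List[Dict]:
    """Collect every missing index request found anywhere in the plan XML."""
    root = ET.fromstring(plan_xml)
    requests = []
    for group in root.iterfind(".//sp:MissingIndexGroup", NS):
        impact = float(group.get("Impact"))
        for index in group.iterfind("sp:MissingIndex", NS):
            # Map each usage type (EQUALITY, INEQUALITY, INCLUDE) to its columns.
            columns = {
                cg.get("Usage"): [c.get("Name") for c in cg.iterfind("sp:Column", NS)]
                for cg in index.iterfind("sp:ColumnGroup", NS)
            }
            requests.append(
                {"table": index.get("Table"), "impact": impact, "columns": columns}
            )
    return requests


all_requests = missing_index_requests(SAMPLE_PLAN)
```

Listing the requests is the easy part; as noted above, each one still deserves review before anyone creates an index from it.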

Conditional Formatting In Power BI

Teo Lachev walks us through conditional formatting in Power BI:

If you have used SSRS, you know that paginated reports are very customizable, and you can make almost any property expression-based. Power BI is yet to deliver expression-based properties to change settings based on runtime conditions, such as to change the font style based on the actual value or user selection. Currently, there are two places where you can use a DAX measure for expression-based formatting of colors:

Click through to see those two places. I think I’d like to see conditional formatting be a bit easier than that.

Calculating Net Present Value And Internal Rate Of Return With DAX

Annie Xu walks us through a couple of financial calculations and how to implement them in DAX:

The Excel XNPV function is a financial function that calculates the net present value (NPV) of an investment using a discount rate and a series of cash flows that occur at irregular intervals. Syntax: =XNPV(rate, values, dates)

The Excel XIRR (Internal Rate of Return) function returns the discount rate which sets the Net Present Value (XNPV) of all future cash flows of an investment to zero. If the NPV of an investment is zero, it doesn’t mean it’s a good or bad investment; it just means you will earn the IRR (discount rate) as your rate of return. Syntax: =XIRR(values, dates, guess)

Click through to see how to do this in DAX, especially if your data is not in exactly the right format.
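
For readers who want to see the arithmetic behind the two functions, here is a small Python sketch (not the DAX from the post). It uses the standard XNPV definition, discounting each cash flow by (1 + rate) raised to the number of days since the first date divided by 365. The root search uses plain bisection rather than the guess-based iteration Excel performs, and the search bounds are arbitrary assumptions.

```python
from datetime import date
from typing import Sequence


def xnpv(rate: float, values: Sequence[float], dates: Sequence[date]) -> float:
    """Net present value of irregular cash flows, discounted to the first date."""
    d0 = dates[0]
    return sum(
        v / (1.0 + rate) ** ((d - d0).days / 365.0)
        for v, d in zip(values, dates)
    )


def xirr(values: Sequence[float], dates: Sequence[date],
         lo: float = -0.99, hi: float = 10.0, tol: float = 1e-9) -> float:
    """Rate at which xnpv() is zero, found by bisection on [lo, hi]."""
    f_lo = xnpv(lo, values, dates)
    f_hi = xnpv(hi, values, dates)
    if f_lo * f_hi > 0:
        raise ValueError("XNPV does not change sign between lo and hi")
    mid = (lo + hi) / 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        f_mid = xnpv(mid, values, dates)
        if abs(f_mid) < tol or (hi - lo) < tol:
            break
        # Keep the half of the interval that still brackets the root.
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return mid


# Pay 1000 now, receive 1100 exactly 365 days later.
flows = [-1000.0, 1100.0]
when = [date(2018, 1, 1), date(2019, 1, 1)]
rate = xirr(flows, when)
```

In this example the rate comes out at roughly 0.10, and plugging 0.10 back into `xnpv` gives zero (up to rounding), which is exactly the relationship between the two functions described above.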

Wrapping Up 12 Days Of dbatools

Garry Bargsley has gone through twelve days of the dbatools module’s functionality:

The final day is upon us, and I have saved the best for last: one you can take into the holiday season as a present from our fearless leader of dbatools. Start-DbaMigration is where the dbatools module started years ago. Who among us has not said “I really should automate this tedious SQL migration stuff” so I do not forget a step when I have not had enough coffee? Well, Chrissy LeMaire not only said it, but made it reality. This early work has grown into the dbatools module that we know today, a multi-tool toolkit that helps DBAs expand their skills without the burden of learning a scripting language and the complexity that goes with hooking it to SQL Server.

This tool has literally changed my life since I found it a couple of years ago. Not only am I doing work in PowerShell, I am automating work processes. It has also contributed to my personal growth. Chrissy and team have created such a welcoming and inclusive atmosphere around the project that it has guided me to contribute to an open source project as well as to present at SQL Saturday events.

Read the whole thing. And if you missed parts of the series, they’re all up on Garry’s blog.

What’s New With Kafka 2.1

Stephane Maarek updates us on the goings-on with Apache Kafka:

Kafka 2.1 is quite a special upgrade because you cannot downgrade due to a schema change in the consumer offsets topics. Otherwise the procedure to upgrade Kafka is still the same as before, see: https://kafka.apache.org/documentation/#upgrade

One of the big changes is support for Java 11. It’s a shame that Spark currently doesn’t support versions past 8.

Static Data Masking In SSMS 18.0

Monica Rathbun introduces a new feature in SQL Server Management Studio:

Ever need to have a test database on hand that you can allow others to query “real like” data without actually giving them actual production data values? In SQL Server Management Studio (SSMS) 18.0 preview, Microsoft introduces us to Static Data Masking. Static Data Masking is a new feature that allows you to create a cloned copy of your database and replace sensitive data with new data (fake data, referred to as masked). You can use this for things like development of business reports and analytics, troubleshooting, database development and even sharing data with outside teams or third parties. Unlike Dynamic Data Masking, added in SQL Server 2016, this feature does not hide the data with characters; rather, it replaces the entire value. For example, with Dynamic Data Masking the name Peter becomes Pxxxx, whereas Static Data Masking changes Peter to Paul. This makes it very easy to use in place of production. Let’s see it in action. If you are not on a newer version of SSMS, don’t worry, you can download it.

It looks like there are a few limitations to keep in mind, so click through to read about those.

SSMS Keyboard Shortcuts In Azure Data Studio

Bob Pusateri reduces a bit of the mental burden of shifting to Azure Data Studio:

One thing about Azure Data Studio I’m not too keen about, though, is that many of the keyboard shortcuts are different. One keyboard shortcut that’s particularly helpful to me is using Ctrl + E to execute queries. I realize that F5 is the most common key to execute a query; however, on most laptop keyboards you now need to hold an additional key to make the function keys behave like function keys. For this reason, Ctrl + E is a wonderful and quick alternative, but it doesn’t work in Azure Data Studio. Or didn’t, until now.
Fortunately, Azure Data Studio is designed to be expanded upon with extensions from both Microsoft and the community. In the case of keyboard shortcuts, a particularly helpful one is called SSMS Keymap, which ports many popular SSMS keyboard shortcuts into Azure Data Studio. With this extension, Ctrl + E is once again an option, and I no longer have to click “Execute” with a mouse, or fumble to find my laptop’s F5 equivalent.

Click through for the demo and grab that extension.