Press "Enter" to skip to content

Day: August 28, 2020

Spark SQL and Delta Lake on Spark 3.0

Denny Lee, et al, walk us through some improvements in Spark 3.0 around Spark SQL’s usage of Delta Lake:

One of most frequent questions through our Delta Lake Tech Talks was when would DML operations such as delete, update, and merge be available in Spark SQL?  Wait no more, these operations are now available in SQL!  Below are example of how you can write delete, update, and merge (insert, update, delete, and deduplication operations using Spark SQL

Read on for the full update.

Comments closed

Unusual Threadpool Waits

Josh Darnell explains why you might get threadpool waits even when you think you shouldn’t:

I occasionally see (usually brief) THREADPOOL waits on systems that are really not all that heavily loaded. This is my investigation into why. Some might say I have too much time on my hands.

Before getting into these unusual THREADPOOL cases, let’s cover the normal ones.

Read the whole thing. It’s example #9068 of how a particular wait is not always a bad thing.

Comments closed

Mutation Testing in Action

Nathan Thompson walks us through a mutation testing experiment:

Since our hypothesis was that the implementation differences between Jasmine and Jest could affect the Mutation Score of our legacy and new test suites, we began by cataloging every bit of Jasmine-specific syntax in our legacy suite. We then compiled a list of roughly forty test files that we would target for Mutation Testing in order to cover the full syntax catalog. For each file we generated a Mutation Score for its legacy state, converted it to run in our new Jest setup, and generated a Mutation Score again. Our hope was that the new Jest framework would have a Mutation Score as good as or better than our legacy framework.

By limiting the scope of our test to just a few dozen files, we were able to run all mutations Stryker had to offer within a reasonable timeframe. However, the sheer size of our codebase and the sprawling dependency trees in any given feature presented other challenges to this work. As I mentioned before, Stryker copies the source code to be mutated into separate sandbox directories. By default, it copies the entire project into each sandbox, but that was too much for Node.js to handle in our repository:

In my undergrad days, I loved mutation testing mostly because of the terminology. I’m happy to see a proper implementation of mutation testing and I’m even happier to see that they have a .NET version.

Comments closed

Checking that Power BI Security Roles are Correct

Fred Kaffenberger poses a question:

If you can ask, how do we know that we are improving, you should also be able to ask how do we know that the security roles are implemented correctly. Data culture is not just for the business, but for the reporting team as well. I haven’t seen much discussion of auditing security roles in Power BI circles, so I’m genuinely curious about how others tackle this issue. Does everyone simply work hard and hope for the best? Or do you restrict everything at the database level and use different apps for different groups instead? There may even be regulatory reasons which require you to restrict it at the database level. But even if you do restrict everything at the database level, you still need to validate that security as well.

Read on for a verification technique.

Comments closed

Passing Power Query Parameters to Stored Procedures

Soheil Bakhshi shows how we can take an input from Power Query and pass it to a stored porcedure:

This is the fourth one in the form of Quick Tips. Here is the scenario. One of my customers had a requirement to get data from a Stored Procedure from SQL Server. She required to pass the values from a Query Parameter back to SQL Server and get the results in Power BI.

The solution is somewhat easy.

If you’re familiar with SQL Server Reporting Services, the solution instantly makes sense.

Comments closed

Preventing Bruce Force Attacks in SQL Server

Raul Gonzalez walks us through some security tips and shows how to lock accounts after a certain number of failures:

SQL Server provides two different forms of authenticating the users that connect to the database server: Windows Authentication, which is the default and preferred method, and SQL Server Authentication, which needs to be explicitly enabled.

There are reasons you might need to enable SQL Server authentication and, although advertised as less secure than Windows Authentication, there are still a few things we can do to minimise the risks.

Read on for those tips.

Comments closed