Press "Enter" to skip to content

Author: Kevin Feasel

Clustered Indexes And Automatic Sorting

Kendra Little demonstrates that clustered indexes do not give us an automatic sorting of our data:

There is no “default” ordering that a query will fall back on outside of an ORDER BY clause.

  • Results may come back in the order of the clustered index, or they may not
  • Even if results come back in the order of the clustered index on one run of the query, they may not come back in the same order if you run it again

If you need results to come back in a specific order, you must be explicit about it in the ORDER BY clause of the query.

Click through for a demo proving these two points.

Comments closed

Recovering A Log-Shipped Database

Sander Stad wraps up his series on log shipping:

Initially log shipping was meant to be used as a warm standby. You have your data on another instance but you still have some human intervention to get it all back up.

Imagine the following situation. You have setup log shipping using either the GUI or the commands in dbatools. You have about 15 databases and everything is working fine.

Until one day the primary instance goes down and is not recoverable. For the production to continue you have to bring the log shipped databases online fast.

You have to figure what the last transaction log backup was. You have to check if it was copied to the secondary instance and if it’s restored.

To do this by running a couple of queries, copying the files if needed and run the log shipping jobs takes time. I’d rather run a command and recover one or more databases and get back to the problem of the primary instance.

Read on to see how to use dbatools to recover a log shipped database.

Comments closed

Building A Power BI Date Dimension

Martin Schoombee explains why you should have a date dimension in Power BI and then has a script which generates one:

If your source system does not contain a Date entity or dimension, a better way is to create a such an entity with Power Query (aka “M”). By creating your own, you can add more attributes than just year, quarter and month…and optimize the size and performance of your model at the same time.

Not a Power Query expert? You don’t have to be. Others in the community have already shared their versions, and I will share my own modified version that you can copy and use for yourself. I’ve used the examples of Matt Masson and Chris Webb as the basis of my own Date entity. You can find their versions here and here.

Read on for Martin’s version.

Comments closed

K-SSIS: Examining SSIS Packages

Ed Elliott has a new tool available:

The main part of the UI is a graphical map of the objects in a package. I have spent quite a lot of time trying to think of a better way of showing the elements that make up a package, and it is hard to argue with the workflow approach, all the ETL tools I found do the same thing and visualizing packages as a series of interconnected tasks is useful but maybe not everything and the only way.

Because of all these issues I have really tried to stay away from SSIS development and looked for other solutions such as using biml to write packages or writing code in other languages but I keep coming back to the fact that the SSIS engine is great at what it does and if you have a lot of data or a lot of files/tables then writing a similar framework would be a lot of wasted effort. So, how can we use the SSIS engine but make better, more efficient use of our development time?

Click through to see a couple of screenshots as well as a link to download K-SSIS.

Comments closed

Meltdown Performance Effects On Cassandra

The folks at Instaclustr have done some analysis on how Meltdown has affected Cassandra performance on AWS:

In our Security Advisory published 8 January, we advised of up to 20% increase in CPU utilization and small increase in latency across managed clusters in AWS and GCP following the rollout of the patches to the cloud provider hypervisors. We have since observed a reversal of this impact in the weeks following the initial announcements. That is, these effects disappeared when further AWS and GCP patches were rolled out by the cloud providers.

We assessed the risk of the vulnerabilities to our environment as Low. Our clusters run as single tenant and customer access is limited to the application layer.  If a user were able to exploit either of the vulnerabilities they could only gain access to their own information.

In short, they saw a change early on, but subsequent patching has removed that performance degradation.  Read the whole thing for more details.

Comments closed

Hortonworks DataFlow 3.1 Released

George Vetticaden and Haimo Liu announce Hortonworks DataFlow version 3.1:

Apache Kafka 1.0 support with full integration with HDF Services – Kafka 1.0 provides important new features including more stringent message processing semantics with support for message headers and transactions, performance improvements and advanced security options.

  • Apache Ambari support for Kafka 1.0 – Install, configure, manage, upgrade, monitor, and secure Kafka 1.0 clusters with Ambari.

  • Apache Ranger support for Kafka 1.0 – Manage access control policies (ACLs) using resource or tag-based security for Kafka 1.0 clusters.

  • New NiFi and SAM processors for Kafka 1.0 – New processors in NiFi and Hortonworks Streaming Analytics Manager (SAM) support Kafka 1.0 features including message headers and transactions.

Click through for the list of top changes.

Comments closed

Library Paths In R

Stacia Varga troubleshoots an issue integrating Power BI with R:

As I was putting together an example of using an R script as a Power BI data source, I ran into some issues on my development machine that was frankly driving me crazy. When I tried to run the query in Power BI with my R script (that ran successfully in the IDE, by the way), I kept getting this message:

DataSource.Error: ADO.NET: R script error.
Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) :

  namespace 'scales' 0.3.0 is being loaded, but >= 0.4.1 is required

Error: package or namespace load failed for 'rnoaa'

Execution halted

Stacia’s answer works as long as the .libPaths() results match expectations.  Another idea would be to set the R_LIBS_USER user-level environment variable to the desired starting directory and that should force the directory in the environment variable to be first when calling .libPaths().

Comments closed

Data Masking Prior To SQL Server 2016

Daniel Hutmacher shows how to roll your own data masking with SQL Server:

Dynamic data masking is a neat new feature in recent SQL Server versions that allows you to protect sensitive information from non-privileged users by masking it. But using a brute-force guessing attack, even a non-privileged user can guess the contents of a masked column. And if you’re on SQL Server 2014 or earlier, you won’t have the option of using data masking at all.

Read on to see how you can bypass dynamic data masking, and for an alternative approach that uses SQL Server column-level security instead.

Click through for the demo.

Comments closed

Bash For The Powershell-Minded

Mark Wilkinson has started a new series on Bash.  His first post is an introduction to the scripting language:

Bash (the Bourne Again Shell) was created in 1989 for the GNU Project as a free replacement for the Unix Bourne shell. Most modern Linux systems use Bash as their default command line shell, so if you have ever dropped to a command line on a Linux system, you have probably used Bash. Just like PowerShell, Bash is both a scripting language and a command shell/interpreter. So not only can you execute commands in an interactive shell session, but you can also write scripts that incorporate multiple commands.

Once you get your hands dirty with Bash you’ll notice a lot of features that were incorporated into PowerShell. Things like command substitution: $(Get-Date) were directly pulled from Bash $(date). Other features will look familiar as well, like the ability to pipe multiple commands together.

One thing you need to understand right away is that Bash is string based, not object based like PowerShell. This means you’ll find yourself doing a lot more string processing to get tasks done. Things like string splitting will be much more common. Bash does support objects, like arrays, but few if any commands output an array. As we go through this series you’ll see that this might not be as limiting as it sounds.

The best part about learning Bash is that you can then get into arguments about Bash vs ksh vs zsh.

Comments closed

Handling Permissions Changes With Powershell

Drew Furgiuele has a process to store and then re-run rights grants on SQL Server databases:

Permission requirements for these environments can change over time, just like the code and data going into your databases. It’s hard to track permissions because a database permission is much more than just a user principal; database objects often contain permission definitions for GRANT and DENY states, and users may belong in certain database roles in one environment, but not another. This isn’t a big deal… until it is: sooner or later your data and code drift will be different than production, or maybe some new change really breaks an environment. Then, you’ll be asked to restore these environments to either an earlier version, or, more likely, you’ll be asked to “refresh” these editions to what is currently in production.

You probably already have a process for this, but how are you handling maintaining differences in permissions between environments? Wouldn’t it be nice if you had a way to quickly evaluate, store, and then re-apply permissions as part of refresh? Even better, wouldn’t it be cool if you could do this for all your databases on a given instance? Or what about all your instances in a given environment?

You can, and you can do it pretty easily with PowerShell.

My one problem with Drew’s otherwise-excellent post is that he approved far too many entry visas in the opening GIF.  100% deny, 0 problems.

Comments closed