Press "Enter" to skip to content

Author: Kevin Feasel

Another Reason To Avoid Shrinking Data Files

Frank Gill gives us a demo of how much log space it takes to shrink a database file:

Yesterday, I was running a health assessment for a client. They are running a weekly maintenance plan that is shrinking all of their data files. After I picked myself up off the floor, I searched the web for “Paul Randal shrink” and hit on Paul’s excellent post Why you should not shrink your data files. In the post, Paul (b|t) demonstrates the effect of DBCC SHRINKDATABASE on index fragmentation. After the demo script, Paul writes, “As well as introducing index fragmentation, data file shrink also generates a lot of I/O, uses a lot of CPU and generates *loads* (emphasis Paul’s) of transaction log.”

This led me to ask the question, “How much is *loads*?”. To find an answer, I made the following modification to Paul’s script:

Read on for the answer.  There are legitimate reasons to shrink data files, but it comes at a very high cost.

Comments closed

New Extended Events In SQL Server 2017

Erik Darling unwraps some of his Christmas presents a bit early:

There are, as of RC2 being released, 194(!) new Events to Extend your mind with. Not all of them are interesting to me, and I haven’t had time to pry into all of the ones that are interesting just yet.

This is a rundown of the new Events with names or descriptions that I found interesting, and will try to spend some time with.

I can’t promise anything

After all, getting some of these to fire is tougher than using a Debugger.

There are some interesting events here.

Comments closed

Building Azure Resource Manager Templates

Ed Elliott gives a brief overview of Azure Resource Manager templates and puts together a sample template:

The ARM API deploys resources to Azure, but doesn’t deploy code onto those resources. For example you can use ARM to deploy a virtual machine with SQL Server already installed but you can’t use ARM to deploy a database from an SSDT DacPac.

To save time when designing solutions, it is important to understand that ARM API is used simply for resources and we need to use some other technology such as DSC or PowerShell to manage the deployments onto the infrastructure once it is deployed.

This is a nice overview of the topic, and because it’s Ed (who is much better about this than most), he goes into how to test before even getting into how to create.

Comments closed

TensorFlow On The Pi

Pete Warden shows how to install TensorFlow on a Raspberry Pi:

It’s never been easy to get TensorFlow installed on a Pi though. I had created a makefile script that let you build the C++ part from scratch, but it took several hours to complete and didn’t support Python. Sam Abrahams, an external contributor, did an amazing job maintaining a Python pip wheel for major releases, but building it required you to add swap space on a USB device for your Pi, and took even longer to compile than the makefile approach. Snips managed to get TensorFlow cross-compiling for Rust, but it wasn’t clear how to apply this to other languages.

Plenty of people on the team are Pi enthusiasts, and happily Eugene Brevdo dived in to investigate how we could improve the situation. We knew we wanted to have something that could be run as part of TensorFlow’s Jenkins continuous integration system, which meant building a completely automatic solution that would run with no user intervention. Since having a Pi plugged into a machine to run something like the makefile build would be hard to maintain, we did try using a hosted server from Mythic Beasts. Eugene got the makefile built going after a few hiccups, but the Python version required more RAM than was available, and we couldn’t plug in a USB drive remotely!

Read the whole thing, even if for the science experiment aspect.

Comments closed

Compacting Shared Libraries In R

Dirk Eddelbuettel compacts the tidyverse:

Of course, there is a third way: just run strip --strip-debug over all the shared libraries after the build. As the path is standardized, and the shell does proper globbing, we can just do

$ strip --strip-debug /usr/local/lib/R/site-library/*/libs/*.so

using a double-wildcard to get all packages (in that R package directory) and all their shared libraries. Users on macOS probably want .dylibon the end, users on Windows want another computer as usual (just kidding: use .dll). Either may have to adjust the path which is left as an exercise to the reader.

When running this against the tidyverse library, shared library sizes dropped from 78 MB down to 6.6 MB.  Not bad for a single command. H/T R-Bloggers

Comments closed

Cross-Container Transactions With Memory-Optimized Objects

Ned Otter continues his series on In-Memory OLTP isolation levels:

Why will it fail?

It will fail because the initiation mode of this transaction is not autocommit, which is required for READ COMMITED SNAPSHOT when referencing memory-optimized tables (the initiation mode is explicit, because we explicitly defined a transaction).  So to be totally clear, for queries that only reference memory-optimized tables, we can use the READ COMMITTED or READ COMMITTED SNAPSHOT isolation levels, but the transaction initiation mode must be autocommit. Keep this in mind, because in a moment, you’ll be questioning that statement….

There are some interesting implications that Ned teases out, so I recommend giving it a careful read.

Comments closed

A dplyr Quiz

John Mount wants to know how well you understand dplyr:

dplyr is one of the most popular R packages. It is powerful and important. But is it in fact easily comprehensible?

dplyr makes sense to those of us who use it a lot. And we can teach part time R users a lot of the common good use patterns.

But, is it an easy task to study and characterize dplyr itself?

Take John’s quiz and find out.  He wasn’t kidding about it being an advanced quiz.

Comments closed

Basics Of Dplyr

Dave Mason is dipping his toes into the R waters:

I think my first exposure to R was at PASS Summit 2016. Since then, I’ve made an effort to attend R sessions at SQL Saturdays. The one commonality I seem to find in all of them is a demo with (or mention of) the dplyr package. It’s a package of functions that manipulate data in data frame objects (think of them as SQL Server/relational tables…or if you’re a .NET developer, a System.Data.DataTable object). R feels inexorably tied to dplyr at this early stage for me. R is probably way more vast than I realize, but what would it be without dplyr? Would it still be as popular? Would it still be as powerful?

What’s It Good For

I’m not sure if I’m perceiving this the right way yet, but dplyr sure feels a lot like LINQ, a .NET Framework technology that provides query-like capability for C#. For instance, you can select a subset of objects from an array, sort them, find a minimum or maximum, etc. It’s kind of like querying SQL Server, just without SQL Server.

I like the comparison of dplyr against LINQ, as they’re both data querying and transformation tools whose motif is a series of functions chained together.

Comments closed

Powershell: Text Search In A Table Value

Shane O’Neill clarifies a misunderstanding in Powershell:

and you are running the following PowerShell command to check if the results contain a value…

1
2
3
$String = "abc"
$Array = @(Invoke-Sqlcmd -ServerInstance "SQLServer" -Database "Database" -Query "SELECT code FROM dbo.users")
$Array.Contains($string)

It will return FALSE.

Now we know that the FALSE is false because we know that the string is in there!
This code is proven to work with arrays as stated here by the “Hey, Scripting Guy!”s so this was getting filed under “WTF PowerShell”

It’s a good post, so check it out and remember that DataRows aren’t strings.

Comments closed