
Author: Kevin Feasel

R Or Python

Tomaz Kastrun shares his thoughts on the topic of R versus Python:

Imag[in]e I ask you, would you prefer Apple iPhone over Samsung Galaxy, respectively? Or if I would ask you, would you prefer BMW over Audi, respectively? In all the cases, both phones or both cars will get the job done. So will Python or R, R or Python. So instead of asking which one I prefer, ask your self, which one suits my environment better? If your background is more statistics and less programming, take R, if you are more into programming and less into statistics, take Python; in both cases you will have faster time to accomplish results with your preferred language. If you ask me, can I do gradient boosting or ANOVA or MDS in Python or in R, the answer will be yes, you can do both in any of the languages.

This graf hits the crux of my opinion, but as I’ve gone deeper into the topic over the past year, I think the correct answer is probably “both” for a mature organization and “pick the one which suits you better” for beginners.

Comments closed

More SSMS Tips And Tricks

Wayne Sheffield has another batch of SSMS tips and tricks for us.  First, he provides some helpful hints with comments.  Then comes a useful addition to SSMS 2016, comparing query plans:

Notice that various options have a colored non-equals icon. Here you can quickly see the various values that are different between the two execution plans.

At the bottom of the execution plans is a Showplan Analysis window. This window has color-coded keys for various sections of the plan:

He also shows how to import and export your SSMS configuration settings.  This makes it easier to migrate to a different machine or keep your desktop and laptop looking the same.


Connection Pooling And Slow Leaks

Warren Estes explains how connection pools work and troubleshoots a connection pooling issue:

When an application connects to a database it takes resources to establish that connection. So rather than doing this over and over again a connection pool is established to handle this functionality and cache connections. There are several issues that can arise if either the pool is not created with the same connection string (fragmentation), or if the connections are simply not closed/disposed of properly.

In the case of fragmentation, each connection string associated with a connection is considered part of 1 connection pool. If you create 2 connection strings with different database names, maxpool values, timeouts, or security then you will in effect create different connection pools. This is much like how query plans get stored in the plan cache. Different white space, capital letters all create different plans.

You can get the .NET pool counts from:
Performance Monitor> .NET data provider for SQL Server > NumberOfActiveConnectionPools
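
To make the two failure modes concrete, here is a minimal toy model in Python. It is not how the .NET provider is implemented; `PoolManager`, `max_pool_size`, and the connection strings are hypothetical names chosen for illustration. It only shows that each distinct connection string gets its own pool and that connections which are never released eventually exhaust it.

```python
from collections import defaultdict


class PoolManager:
    """Toy model: one pool per distinct connection string (exact text match)."""

    def __init__(self, max_pool_size=2):
        self.max_pool_size = max_pool_size
        self.idle = defaultdict(list)    # connection string -> idle connections
        self.in_use = defaultdict(int)   # connection string -> checked-out count
        self.opened = 0                  # physical connections opened so far

    def acquire(self, conn_str):
        if self.idle[conn_str]:
            # Reuse a cached connection instead of paying the setup cost again.
            conn = self.idle[conn_str].pop()
        elif self.in_use[conn_str] < self.max_pool_size:
            self.opened += 1
            conn = (conn_str, self.opened)
        else:
            # Every slot is checked out and nothing was returned: the "slow leak".
            raise RuntimeError("pool exhausted for " + conn_str)
        self.in_use[conn_str] += 1
        return conn

    def release(self, conn):
        conn_str = conn[0]
        self.in_use[conn_str] -= 1
        self.idle[conn_str].append(conn)

    def pool_count(self):
        return len(set(self.idle) | set(self.in_use))


mgr = PoolManager(max_pool_size=2)
sales = "Server=db1;Database=Sales;"      # hypothetical connection strings
reports = "Server=db1;Database=Reports;"

conn = mgr.acquire(sales)
mgr.release(conn)            # closed properly, so it goes back to the pool
conn = mgr.acquire(sales)    # reused: no new physical connection
mgr.release(conn)
mgr.acquire(reports)         # different string, so a second pool appears
print(mgr.pool_count(), mgr.opened)
```

In this sketch, calling `acquire` repeatedly on the same string without `release` raises once `max_pool_size` connections are checked out, which is the toy equivalent of a pool timeout.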

Click through for more information.


Finding The Last Known Good CHECKDB Run

Amy Herold shows how to find the last known good CHECKDB run for each database on a SQL Server instance:

Wednesday I walk into the office and immediately hear that CHECKDB is the source of issues on one of the servers and is the reason behind some errors that have been happening. While I don’t think this is the case (it might look like it on the surface but there is something else that is happening that is the actual cause) I also wanted to find out what CHECKDB was running at the time the errors occurred.

I needed information on when CHECKDB ran for each database. When you look for what you can run to find when CHECKDB was last run you find this blog post and also this blog post on grabbing this info. While these were very informative, they were for one database at a time. I need this for all the databases so I can try to not only find out when each one ran, but also use these time stamps to figure out the duration.

The big recommendation I’d make with regard to this is not to use sp_msforeachdb.  Otherwise, click through for a good script.
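
As a rough sketch of the "all databases at once" idea, assume you have already collected, per database, the field/value pairs that `DBCC DBINFO ... WITH TABLERESULTS` returns (one commonly cited source for the `dbi_dbccLastKnownGood` value; the excerpt does not confirm that Amy's script works this way). The helper names and the sample rows below are hypothetical, and the Python only handles the collation step, not the database calls.

```python
from datetime import datetime

# SQL Server reports 1900-01-01 when CHECKDB has never completed cleanly.
NEVER_RUN = datetime(1900, 1, 1)


def last_known_good(dbinfo_rows):
    """Return the last known good CHECKDB time from (field, value) rows, or None."""
    for field, value in dbinfo_rows:
        if field == "dbi_dbccLastKnownGood":
            stamp = datetime.strptime(value, "%Y-%m-%d %H:%M:%S.%f")
            return None if stamp == NEVER_RUN else stamp
    return None


def summarize(rows_by_database):
    """Map each database name to its last known good CHECKDB time."""
    return {name: last_known_good(rows) for name, rows in rows_by_database.items()}


# Hypothetical sample rows, standing in for real DBCC DBINFO output.
sample = {
    "Sales": [("dbi_version", "852"),
              ("dbi_dbccLastKnownGood", "2017-02-07 23:10:04.123")],
    "Scratch": [("dbi_dbccLastKnownGood", "1900-01-01 00:00:00.000")],
}

for name, stamp in summarize(sample).items():
    print(name, stamp if stamp else "never")
```

Looping over an explicit list of database names, as this sketch assumes, is one way to avoid leaning on sp_msforeachdb.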


SQL Server Backups On Azure VM

Rolf Tesmer shows us various options available for backing up SQL Server on Azure VMs:

Recently I had a requirement to collate and briefly compare some of the various methods to perform SQL Server backup for databases deployed onto Azure IaaS machines.  The purpose was to provide a few options to cater for the different types(OLTP, DW, etc) and sizes (small to big) of databases that could be deployed there.

Up front, I am NOT saying that these are the ONLY options to perform standard SQL backups!  I am sure there are others – however – the below are both supported and well documented – which when it comes to something as critical as backups is pretty important.

So the purpose of this blog is to provide a quick and brief list of various SQL backup methods!

Read on for the options.


A Definition Of Functional Programming

Kevin Sookocheff contrasts functional programming with its imperative cousin:

Functional programming is a form of declarative programming that expresses a computation directly as pure functional transformation of data. A functional program can be viewed as a declarative program where computations are specified as pure functions.

I think that if you’re a set-based SQL developer, functional programming languages will make the most intuitive sense.  They’re a bit harder to wrap your mind around if you’ve grown up as an imperative C-style developer, but are still worth the effort.
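
A small Python sketch of the contrast, using made-up order data: the imperative version mutates a running total step by step, while the functional version describes the result as a pipeline of pure transformations, which reads a lot like a WHERE clause followed by SUM.

```python
from functools import reduce

# Hypothetical data: (product, quantity, unit_price)
orders = [("widget", 3, 2.50), ("gadget", 1, 10.00), ("widget", 2, 2.50)]


def total_imperative(orders, product):
    """Imperative style: loop and mutate an accumulator."""
    total = 0.0
    for name, qty, price in orders:
        if name == product:
            total += qty * price
    return total


def total_functional(orders, product):
    """Functional style: filter, map, then fold, with no mutation."""
    matching = filter(lambda o: o[0] == product, orders)
    amounts = map(lambda o: o[1] * o[2], matching)
    return reduce(lambda acc, x: acc + x, amounts, 0.0)


print(total_imperative(orders, "widget"), total_functional(orders, "widget"))
```

Both functions return the same number; the difference is that the second never changes any state along the way.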


Building A Windows Failover Cluster

David Fowler continues his series on building a test lab:

Now I don’t want to get into the details of Quorum, there are plenty of great posts out there that explain it far better than I can but in a nutshell, each node in the cluster has a vote and we really want the total number of votes to be an odd number.  But we’ve only got two servers, does that mean that we need a create another server to make an odd number?  Well, no we don’t.  What we can use is what’s known as a file share witness, and that’s simply a file share that each of the nodes in the cluster can access.  That file share will effectively act as our third vote.

So first thing that you’re going to need to do is create a file share somewhere, the best place for that in our setup would be on the domain controller or somewhere that we know is always likely to be available.  So go and do that now, call it what you like but make sure that the servers are going to have full rights to it.  As this is just our own personal little test lab and we’re not too worried about best practices you could possibly open it up to EVERYONE (probably not a great idea in a production environment but not the end of the world if we want to be lazy in our own little play pen).
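
The voting arithmetic is easy to check with a few lines of Python. This is a simplified majority rule only; it assumes one vote per node plus one for the witness and ignores features such as dynamic quorum. The `has_quorum` helper is a hypothetical name.

```python
def has_quorum(votes_online, total_votes):
    """Simple majority rule: more than half of all votes must be reachable."""
    return votes_online > total_votes // 2


# Two nodes, no witness: losing one node leaves 1 of 2 votes.
print(has_quorum(votes_online=1, total_votes=2))

# Two nodes plus a file share witness: losing one node leaves 2 of 3 votes.
print(has_quorum(votes_online=2, total_votes=3))
```

With only two votes, a single failure leaves the cluster without a majority; the witness supplies the tie-breaking third vote.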

David also shows how to set up an Availability Group.


More Fun With SSMS

Wayne Sheffield continues his SSMS tools and tips series.  Since our last look, he’s added a few more tips.  First, Wayne shows how to show and hide blocks of T-SQL using Outlining.  Then, he gets to something I find useful in SSMS:

There are many editing items in SSMS that makes formatting and navigating your code easier than ever. Most of these Quick Editing Tips that follow are available from the Advanced submenu on the Edit menu

In particular, I like showing whitespace characters, as I’m kind of a whitespace tyrant.  But there are several other helpful options in that menu.

From there, Wayne shows how to use bookmarks in SSMS, which is something I tend not to do.  Finally, you can see the SSMS web browser.


Avoid Scalar Functions In Computed Columns

Daniel Hutmacher shows why you should not include scalar functions inside computed column definitions:

Scalar functions can be a real headache when you’re performance tuning. For one, they don’t parallelize. In fact, if you use a scalar function in a computed column, it will prevent any query that uses that table from going parallel – even if you don’t reference that column at all!

Read on for a demonstration.


Azure And The Kappa Architecture

Jared Zagelbaum describes the Kappa architecture and points out that Azure has limited built-in support for it:

You can’t support kappa architecture using native cloud services. Cloud providers, including Azure, didn’t design streaming services with kappa in mind. The cost of running streams with TTL greater than 24 hours is more expensive, and generally, the max TTL tops out around 7 days. If you want to run kappa, you’re going to have to run Platform as a Service (PaaS) or Infrastructure as a Service (IaaS), which adds more administration to your architecture. So, what might this look like in Azure?
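
To see why retention matters so much here, consider a toy Python sketch of the kappa idea: views are rebuilt by replaying the event log through whatever processing logic is current. The `EventLog` class, the hour-based timestamps, and the sample events are all hypothetical; no real streaming service is modeled.

```python
class EventLog:
    """Append-only log that only replays events inside its retention window."""

    def __init__(self, retention_hours):
        self.retention_hours = retention_hours
        self.events = []  # (timestamp_in_hours, key, amount)

    def append(self, ts, key, amount):
        self.events.append((ts, key, amount))

    def replay(self, now):
        return [(k, a) for ts, k, a in self.events
                if now - ts <= self.retention_hours]


def build_view(events, transform):
    """Rebuild an aggregate view from scratch by folding over replayed events."""
    view = {}
    for key, amount in events:
        view[key] = view.get(key, 0) + transform(amount)
    return view


def load(log):
    log.append(0, "east", 10)
    log.append(20, "west", 5)
    log.append(40, "east", 7)
    return log


now = 48
long_log = load(EventLog(retention_hours=168))   # about 7 days
short_log = load(EventLog(retention_hours=24))

original = build_view(long_log.replay(now), lambda a: a)
reprocessed = build_view(long_log.replay(now), lambda a: a * 2)  # new logic, same log
truncated = build_view(short_log.replay(now), lambda a: a)       # older events are gone
print(original, reprocessed, truncated)
```

With the long retention window, changing the logic just means replaying the same log; with the 24-hour window, the rebuilt view silently loses everything older than a day.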

Read the whole thing.
