Press "Enter" to skip to content

Author: Kevin Feasel

Windows Versus SQL Authentication

Kenneth Fisher looks into why we generally consider Windows authentication more secure than SQL authentication:

A quick search on the internet took me here: Choosing an Authentication Mode. And if you go down to the section Connecting Through Windows Authentication, it points out a few important things; even farther down, the section Disadvantages of SQL Server Authentication has a bit more. Then I found a couple of good forum questions here and here. In summary (and only discussing actual security features):

Click through for the answers.  Also read Cristian Satnic’s comments below, as Cristian is correct about wanting to keep passwords hashed instead of encrypted.  Incidentally, Windows passwords aren’t encrypted, either—they’re hashed.
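If you want to see the distinction at the DDL level, here is a minimal sketch creating one login of each type; the domain, login names, and password are hypothetical:

-- Windows authentication: the domain controller validates the user;
-- SQL Server stores no password at all for this login.
CREATE LOGIN [CONTOSO\AppUser] FROM WINDOWS;

-- SQL authentication: SQL Server stores a salted hash of the password
-- and validates logins itself. CHECK_POLICY and CHECK_EXPIRATION borrow
-- the Windows password policy for SQL logins.
CREATE LOGIN AppLogin
    WITH PASSWORD = 'ChangeMe-N0t-4-Real!',
    CHECK_POLICY = ON,
    CHECK_EXPIRATION = ON;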

Pausing Online Index Rebuilds

Andrew Pruski demos resumable online index rebuilds in SQL Server 2017:

Pretty cool, huh? Full information on this can be found here: https://docs.microsoft.com/en-us/sql/t-sql/statements/alter-index-transact-sql

I think this is very useful but we do need to be careful. The documentation says that pausing an online index rebuild for a long time may affect query performance and disk utilisation. This is due to the newly rebuilt index being created side by side with the original one, so we’ll need to watch out for that.

It’s a good first look at what might be a very interesting solution for companies tight on maintenance window time.
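If you want to try it yourself, here is a minimal sketch against a hypothetical table and index, assuming SQL Server 2017 or later:

-- Start a resumable online rebuild (hypothetical index and table names).
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders
    REBUILD WITH (ONLINE = ON, RESUMABLE = ON, MAX_DURATION = 60 MINUTES);

-- From another session, pause the rebuild when the maintenance window closes...
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders PAUSE;

-- ...check on its progress...
SELECT name, state_desc, percent_complete
FROM sys.index_resumable_operations;

-- ...and pick it back up where it left off.
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders RESUME;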

Default Column Storage

Paul Randal explains how default column values are stored:

And selecting the initial 10 rows can be demonstrated to return the 3rd column using the initial default set in step 3. (It makes no difference if any rows are added between steps 3 and 4.)

This means that there *must* be two default values stored when a new column is added: one for the set of already-existing rows that don’t have the new column and one for any new rows. Initially these two default values will be the same, but the one for new rows can change (e.g. in steps 4 and 5 above) without breaking the old rows. This works because after the new column is added (step 3 above), it’s impossible to add any more rows that *don’t* have the new column.

And this is exactly how it works. Let’s investigate!

In typical Paul Randal fashion, this is both a look at internals and an interesting explanation.
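If you’d like a quick repro of the behavior, here is a minimal sketch along the lines Paul describes; the table, column, and constraint names are hypothetical:

-- A hypothetical table with two columns and a few pre-existing rows.
CREATE TABLE dbo.DefaultDemo (ID INT NOT NULL, Val VARCHAR(10) NOT NULL);
INSERT INTO dbo.DefaultDemo (ID, Val) VALUES (1, 'a'), (2, 'b'), (3, 'c');

-- Add a third, NOT NULL column with a default; the existing rows can pick up
-- the value 10 from the stored default rather than being rewritten.
ALTER TABLE dbo.DefaultDemo
    ADD Col3 INT NOT NULL CONSTRAINT DF_DefaultDemo_Col3 DEFAULT 10;

-- Swap in a new default, which affects only rows inserted from here on.
ALTER TABLE dbo.DefaultDemo DROP CONSTRAINT DF_DefaultDemo_Col3;
ALTER TABLE dbo.DefaultDemo
    ADD CONSTRAINT DF_DefaultDemo_Col3_v2 DEFAULT 20 FOR Col3;

INSERT INTO dbo.DefaultDemo (ID, Val) VALUES (4, 'd');

-- Rows 1-3 still return 10 for Col3; row 4 returns 20.
SELECT ID, Val, Col3 FROM dbo.DefaultDemo;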

Securing Azure Storage

Christos Matskas has an article on securing Azure blobs and containers:

All communication with Azure Storage via connection strings and BLOB URLs enforces the use of HTTPS, which provides encryption in transit. You can enforce the use of “Always HTTPS” by setting the connection string like this: “DefaultEndpointsProtocol=https;AccountName=myblob1…” or in SAS signatures, as in the example below:

https://myblob1.blob.core.windows.net/?sv=2015-04-05&ss=bfqt&srt=sco&sp=rwdlacup&se=2016-09-22T02:21:41Z&st=2016-09-21T18:21:41Z&spr=https&sig=hxInpKBYAxvwdI9kbBglbrgcl1EJjHqDRTF2lVGeSUU%3D

To protect data at rest, the service provides an option to encrypt the data as it is stored in the account. There’s no additional cost associated with encrypting the data at rest, and it’s a good idea to switch it on as soon as the account is created. There is a one-click setting at the storage account level to enable it, and the encryption is applied to both new and existing storage accounts. The data is encrypted with the AES-256 cipher, and the feature is now generally available in all Azure regions and Azure clouds (public, government, etc.).

There’s some good information here, making it worth the read.

Survival Analysis

Joseph Rickert explains what survival analysis is and shows an example with R:

Looking at the Task View on a small screen is a bit like standing too close to a brick wall – left-right, up-down, bricks all around. It is a fantastic edifice that gives some idea of the significant contributions R developers have made both to the theory and practice of Survival Analysis. As well-organized as it is, however, I imagine that even survival analysis experts need some time to find their way around this task view. (I would be remiss not to mention that we all owe a great deal of gratitude to Arthur Allignol and Aurélien Latouche, the task view maintainers.) Newcomers, people either new to R or new to survival analysis or both, must find it overwhelming. So, it is with newcomers in mind that I offer the following slim trajectory through the task view that relies on just a few packages: survival, KMsurv, OIsurv and ranger.

The survival package, which began life as an S package in the late ’90s, is the cornerstone of the entire R Survival Analysis edifice. Not only is the package itself rich in features, but the object created by the Surv() function, which contains failure time and censoring information, is the basic survival analysis data structure in R.

Survival analysis is an interesting field of study.  In engineering fields, the most common use is calculating mean time to failure, but that’s certainly not the only place you’re liable to see it.
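The core objects are easy to state: the survival function S(t) = P(T > t) is the probability that the time to event T exceeds t, and mean time to failure is simply the area under that curve, MTTF = ∫ S(t) dt taken from 0 to infinity. Most of what these packages do is estimate S(t), or the closely related hazard function, from data in which some observations are censored.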

Outliers In Histograms

Edwin Thoen has an interesting solution to a classic problem with histograms:

Two strategies that make the above into something more interpretable are taking the logarithm of the variable, or omitting the outliers. Neither shows the original distribution, however. Another way to go is to create one bin for all the outlier values. This way we would see the original distribution where the density is the highest, while at the same time getting a feel for the number of outliers. A quick and dirty implementation of this would be

library(dplyr)
library(ggplot2)
# hist_data is a data frame with a numeric column x, defined earlier in Edwin's post;
# everything above 10 gets capped into a single overflow bin at 10.
hist_data %>%
  mutate(x_new = ifelse(x > 10, 10, x)) %>%
  ggplot(aes(x_new)) +
  geom_histogram(binwidth = .1, col = "black", fill = "cornflowerblue")

Edwin then shows a nicer solution, so read the whole thing.

Handling Rogue Queries In Spark

Alicja Luszczak, et al., introduce the Query Watchdog:

The previous query would cause problems on many different systems, regardless of whether you’re using Databricks or another data warehousing tool. Luckily, as a user of Databricks, this customer has a feature available, called the Query Watchdog, that can help solve this problem.

Note: Query Watchdog is available on clusters created with version 2.1-db3 and greater.

A Query Watchdog is a simple process that checks whether or not a given query is creating too many output rows for the number of input rows at a task level. We can set a property to control this, and in this example we will use a ratio of 1000 (which is the default).

It looks like this is an all-or-nothing process, but a very interesting start.
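If you want to experiment with it, the settings can be flipped on per session; here is a minimal sketch in Spark SQL, assuming the property names from the original Databricks post:

-- Turn the watchdog on and flag any query whose tasks produce more than
-- 1,000 output rows per input row.
SET spark.databricks.queryWatchdog.enabled = true;
SET spark.databricks.queryWatchdog.outputRatioThreshold = 1000;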

Deploying Reports With PowerShell

Jana Sattainathan has created a few PowerShell functions to automate SQL Server Reporting Services report deployment:

In this post, I want to publish a few functions that I created around SSRS. They are related to and depend on each other.

  • Get-SSRS – Given the SSRS URI, returns the WSDL endpoint

  • Get-SSRSReport – Returns one or more reports based on inputs

  • Get-SSRSSharedDataSource – Returns one or more shared data sources based on inputs

  • Get-SSRSReportDataSource – Returns the data source information on a report-by-report basis, based on inputs

  • Set-SSRSReportDataSource – Sets the data source of a report to the given data source

  • Install-SSRS – Deploys an SSRS report to a specific folder and optionally sets the data source for the deployed report

Very useful.

Cloudera Accessing Azure Data Lake Store

The Azure Data Lake team has announced that you can now access Azure Data Lake Store using a Cloudera cluster:

The Azure Data Lake (ADL) vision from the beginning has been to transform business data into intelligence by providing analytics on any data at cloud scale. ADL enterprise customers gain insights on their business data using a wide range of tools and platforms. Today’s release of Cloudera Enterprise 5.11 brings another very valuable and widely used Hadoop computation platform to the set of platforms that can leverage ADLS. No matter what big data analytics platform you choose, Azure Data Lake Store provides a single high throughput enterprise-scale hierarchical file system data lake repository for big data.

Anyone with an Azure subscription can now deploy Cloudera clusters with ADLS. To get started, you can use the Cloudera Enterprise Data Hub template or the Cloudera Director template on Azure Marketplace to create a Cloudera cluster. Once the cluster is up, see here for more information on how to set up your Cloudera cluster with ADLS today!

That’s an interesting development.

Code Formatting

Bert Wagner has a few tips on code formatting to make it more readable:

The second example above consistently indents lines, adds new lines, and follows consistent coding patterns. This makes it easy to skim the code quickly.

Books have chapters, headings, and paragraphs defined by formatting that make it easy to find what is needed at a glance — formatting code makes it possible to find things easily too.

The examples Bert uses are all in C#, but the advice applies to most languages.  I think consistency is key, even more so than any particular ideal format.  This reduces friction between developers, at least outside of the “what should our coding standards be?” meetings…
