Press "Enter" to skip to content

Curated SQL Posts

Sentiment Analysis

Dustin Ryan and Patrick Leblanc used Azure ML and Power BI to do sentiment analysis:

Using Azure ML and a free subscription to the Text Analytics API, I’m going to show you how to perform sentiment analysis and key phrase extraction on tweets with the hashtag #Colts (after this past Sunday’s 51-16 beat down of the Colts at the hands of the Jacksonville Jaguars, I’m bathing in the tears of Colts fans. Watch the highlights! ). Although my example here is somewhat humorous, the steps can be used to perform sentiment analysis and key phrase extraction on any text data as long as you can get the data into Power Query.

This is a fantastic example of how Azure ML can be used.  Read the whole thing.

Comments closed

Powershell To Modify Team Foundation Version Control

Kevin Eckart makes use of TFS commands to maintain source:

In our environment, changes made in the Test branch have to travel through the Main branch and into the Release branch to be deployed into production. Sometimes changes need to move through quickly without regard to other changes, especially in an environment where there may be a single coder. Note: the following code will merge all checked in code regardless of who checked it in. Be careful in multi-coder environments.

This is your daily public service announcement saying that if you don’t have your database code in source control, you really should get your database code into source control.

Comments closed

Incrementing All Sequences

Mark Broadbent had to increment all of his sequences by 10,000.  Here’s how he did it:

The only problem with this approach is that our database was configured (rightly or wrongly) with approximately 250 sequences! Since we could not be sure which sequences would ultimately cause us problems we decided to increment each one by 10,000.

Not being someone who likes performing monotonous tasks and also recognising the fact that this task would probably need to be performed again in the future I decided to attempt to programmatically solve this problem.

The script isn’t too difficult to understand but let me reiterate his warning:  read the script before you run it, and know exactly what it’s doing before you run it.

Comments closed

CISL

Niko Neugebauer talks about the Columnstore Indexed Scripts Library:

Around 3.5 Months ago in September of 2015, I have announced the first public release of the CISL – Columnstore Indexes Scripts Library, which allows to have a deeper insight into the database that uses or can use Columnstore Indexes.
Since that, I have released 4 more “point releases” with bug fixes and new features, I have greatly expanded the support of SQL Server with inclusion of SQL Server 2012, SQL Server 2016 and Azure SQLDatabase.

If you use columnstore indexes, you absolutely want to get this.  Also, there’s a brand new update out.

Comments closed

Finding Login Permissions

Andy Galbraith has a new permissions script:

I recently was tasked with this ticket:

Please add new login Domain\Bob to server MyServer.  Grant the login the same permissions as Domain\Mary.

On the face of it, this seems relatively straightforward, right?  It is the kind of request that we all get from time to time, whether as an ad-hoc task or as part of a larger project, such as a migration.

The catch of course is that it isn’t that easy – how do you know what permissions Mary has?

Andy’s script looks good.  For bonus points, compare it to fn_my_permissions.

Comments closed

Tuning SQL Server Backups

Derik Hammer has a post on tuning SQL Server backups:

Finally, do not forget about your memory. To backup or restore a database you have to load data pages into memory. We will talk more about memory below and how the internal buffer pool comes into play and can cause operating system paging or out of memory conditions.

Derik shows the various knobs and switches available, and I want to emphasize one thing:  optimizing backup statements involves testing different scenarios.  You can make good guesses as to the appropriate MAX_TRANSFER_SIZE or BUFFERCOUNT, but even then, test different combinations and find what works best for each database.

Comments closed

FOR JSON WITHOUT_ARRAY_WRAPPER

Jovan Popvic introduces us to the WITHOUT_ARRAY_WRAPPER clause:

This option enables you to remove square brackets [ and ] that surround JSON text generated by FOR JSON clause. I will use the following example:

SELECT 2015 as year, 12 as month, 15 as day
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER

This query will return:

{ "year":2015, "month":12, "day":15 }

However, without this option, following text would be returned:

[{ "year":2015, "month":12, "day":15 }]

Jovan also points out important changes between 3.1 and 3.2 with FOR JSON output.

Comments closed

Restoring CDC-Enabled Databases

Mark Broadbent shows us how to restore databases with Change Data Capture enabled:

This automatically poses the question of how it is possible to restore a backup chain with CDC? On a database restore, in order to apply differential backups and transaction logs the NORECOVERY clause is required to prevent SQL Server from performing database recovery.

If this option is required but KEEP_CDC in conjunction with it is incompatible, surely this means point in time restores are not possible for restores that require CDC tables to be retained?

-Wrong!

The answer is a bit surprising, and my guess is that most database administrators are totally unaware of this restoration quirk.

Comments closed

Learning R

Grant Fritchey is learning R:

Awesome. Fixed that algorithm problem, right?

Wrong.

That’s because algorithms are not the problem… the only problem. The real problem is data preparation. A lot of the examples you’ll read online are very straight forward with nice neat data sets. That’s because they were carefully groomed and prepared. Here I am looking at the wooly wild real data and I’m utterly lost in how to properly prepare this so that it’s appropriately set up as a continuous distribution(or a distribution at all). WOOF! The reason this is so hard is because I actually don’t understand the data fundamentals of the problem I’m trying to solve in exactly the way needed to solve the problem. More cogitation is necessary.

Just because you can write R code doesn’t mean you are a data scientist.  Grant has the right mindset, but this post is fair warning that R’s complexity isn’t so much in its being a DSL, but rather in the domain itself.

Comments closed