Press "Enter" to skip to content

Curated SQL Posts

SQL Server ML Services

SQL Server R Services is now SQL Server Machine Learning Services and supports Python.  First, Nagesh Pabbisetty and Sumit Kumar talk about Python support:

The addition of Python builds on the foundation laid for R Services in SQL Server 2016 and extends that mechanism to include Python support for in-database analytics and machine learning. We are renaming R Services to Machine Learning Services, and R and Python are two options under this feature.

The Python integration in SQL Server provides several advantages:

  • Elimination of data movement: You no longer need to move data from the database to your Python application or model. Instead, you can build Python applications in the database. This eliminates barriers of security, compliance, governance, integrity, and a host of similar issues related to moving vast amounts of data around. This new capability brings Python to the data and runs code inside secure SQL Server using the proven extensibility mechanism built in SQL Server 2016.

  • Easy deployment: Once you have the Python model ready, deploying it in production is now as easy as embedding it in a T-SQL script, and then any SQL client application can take advantage of Python-based models and intelligence by a simple stored procedure call (a sketch of such a call follows this list).

  • Enterprise-grade performance and scale: You can use SQL Server’s advanced capabilities like in-memory tables and columnstore indexes with the high-performance, scalable APIs in the RevoScalePy package. RevoScalePy is modeled after the RevoScaleR package in SQL Server R Services. Using these with the latest innovations in the open source Python world allows you to bring unparalleled selection, performance, and scale to your SQL Python applications.

  • Rich extensibility: You can install and run any of the latest open source Python packages in SQL Server to build deep learning and AI applications on huge amounts of data in SQL Server. Installing a Python package in SQL Server is as simple as installing a Python package on your local machine.

  • Wide availability at no additional costs: Python integration is available in all editions of SQL Server 2017, including the Express edition.
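The stored procedure call in question is sp_execute_external_script, the same extensibility mechanism R Services uses. Here is a minimal sketch of what that looks like, assuming SQL Server 2017 with Machine Learning Services (Python) installed; the doubling logic stands in for a real model:

-- One-time setup: allow external (R/Python) scripts to run
EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE;
GO

-- Run Python over a T-SQL result set; InputDataSet and OutputDataSet
-- are the default pandas DataFrame names on the Python side
EXEC sp_execute_external_script
    @language = N'Python',
    @script = N'
OutputDataSet = InputDataSet
OutputDataSet["Doubled"] = OutputDataSet["Value"] * 2
',
    @input_data_1 = N'SELECT 1 AS [Value] UNION ALL SELECT 2'
WITH RESULT SETS (([Value] INT, Doubled INT));

Any client that can call a stored procedure can consume the results, which is the deployment story the announcement is describing.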

Nagesh Pabbisetty also announces Microsoft R Server 9.1:

We took the first step with Microsoft R Server 9.0, and this follow-on release includes significant innovations such as:

  • New machine learning enhancements and inclusion of pre-trained cognitive models such as sentiment analysis & image featurizers

  • SQL Server Machine Learning Services with integrated Python in Preview

  • Enterprise-grade operationalization with real-time scoring and dynamic scaling of VMs

  • Deep customer & ISV partnerships to deliver the right solutions to customers

  • A panoply of sources to help you get started with ease

And Joseph Sirosh indicates that AI is where the money is:

So today it’s my pleasure to announce the first RDBMS with built-in AI: a production-quality Community Technology Preview (CTP 2.0) of SQL Server 2017. In this preview release, we are introducing in-database support for a rich library of machine learning functions, and now for the first time Python support (in addition to R). SQL Server can also leverage NVIDIA GPU-accelerated computing through the Python/R interface to power even the most intensive deep-learning jobs on images, text, and other unstructured data. Developers can implement NVIDIA GPU-accelerated analytics and very sophisticated AI directly in the database server as stored procedures and gain orders of magnitude higher throughput. In addition, developers can use all the rich features of the database management system for concurrency, high availability, encryption, security, and compliance to build and deploy robust enterprise-grade AI applications.

There’s a lot to digest here.

SQL Server 2017 CTP 2.0

The SQL Server team announces CTP 2.0 of SQL Server 2017:

Microsoft is excited to announce a new preview for the next version of SQL Server!  We disclosed a name for this next release, SQL Server 2017, today at the Microsoft Data Amp event. Community Technology Preview (CTP) 2.0 is the first production-quality preview of SQL Server 2017, and it is available on both Windows and Linux.  In this preview, we added a number of new capabilities, including the ability to run advanced analytics using Python in a parallelized and highly scalable way, the ability to store and analyze graph data, and other capabilities that help you manage SQL Server for high performance and uptime, including the Adaptive Query Processing family of intelligent database features and resumable online indexing.

I can finally call it “SQL Server 2017” instead of “SQL Server vNext.”  I don’t know why there was such a hubbub about the name 2017, but there you go.  Anyhow, I’ve grabbed the CTP and am raring to go.
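Of those capabilities, resumable online index rebuilds are the easiest to show in a snippet. A sketch of the new syntax, assuming an index named IX_Orders_OrderDate on a hypothetical dbo.Orders table:

-- Kick off an online rebuild that can be paused and resumed,
-- capped at 60 minutes of run time per attempt
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders
REBUILD WITH (ONLINE = ON, RESUMABLE = ON, MAX_DURATION = 60 MINUTES);

-- Pause it when the business day starts; resume it later from another session
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders PAUSE;
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders RESUME;

-- Check on paused or in-flight resumable operations
SELECT name, percent_complete, state_desc
FROM sys.index_resumable_operations;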

Temporal Tables For R Source Control

Tomaz Kastrun shares an unorthodox way of collecting historical R code changes:

I will not comment on the solution Bob provided, since I don’t know how their infrastructure, roles, and security are set up. At this point, I am grateful for his comment. But what I will comment on is that there is no straightforward way or any out-of-the-box solution. Furthermore, if your R code requires any additional packages, storing the packages with your R code is not that bad an idea, regardless of traffic or disk overhead. And versioning the R code is something that is for sure needed.

To continue from the previous post: getting or capturing R code, once it gets to Launchpad, is tricky, so storing R code in a database table or on the file system seems a better idea.

It’s an interesting concept.  My preference is to use R Tools for Visual Studio and a more traditional source control mechanism.  It involves keeping source control up to date, but that’s a good practice to follow in any case.
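For the flavor of Tomaz’s approach, here is a minimal sketch of a system-versioned temporal table that would version R scripts on every update; the table and column names are illustrative, not his:

CREATE TABLE dbo.RScripts
(
    ScriptID   INT IDENTITY(1,1) PRIMARY KEY,
    ScriptName SYSNAME NOT NULL,
    ScriptBody NVARCHAR(MAX) NOT NULL,
    ValidFrom  DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo    DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.RScriptsHistory));

-- Every UPDATE to ScriptBody is captured automatically; pull the full
-- change history of one script with FOR SYSTEM_TIME ALL
SELECT ScriptName, ValidFrom, ValidTo, ScriptBody
FROM dbo.RScripts FOR SYSTEM_TIME ALL
WHERE ScriptName = N'PredictSales'
ORDER BY ValidFrom;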

Parameters In rmarkdown Reports

Steph Locke shows how to use table parameters in rmarkdown reports:

The recent(ish) advent of parameters in rmarkdown reports is pretty nifty, but there’s a little bit of behaviour that can come in handy and doesn’t come across in the documentation: you can use table parameters for rmarkdown reports.

Previously, if you wanted to produce multiple reports based off a dataset, you would make the dataset available and then perform filtering in the report. Now we can pass the filtered data directly to the report, which keeps all the filtering logic in one place.

It’s actually super simple to add table parameters for rmarkdown reports.

Click through to see the script.  As promised, it is in fact easy to do.

Samba On Linux Mint

Mark Broadbent shows how to set up SMB shares in Linux:

This quick guide is specifically targeted at the Linux Mint distribution (although it will be applicable to many others). It only describes how to share your Linux filesystem folders and does not go into any detail regarding the advanced Samba functionality.

Even though Linux Mint attempts to make folder sharing more user-friendly, I have never had any success using the GUI-based procedure, and have even struggled with the method described in this article. Furthermore, I prefer to understand what is being configured behind the scenes, so I shall keep to the point and keep it simple.

Very useful post, given the cross-platform move Microsoft is making.

What Is The Data Platform?

Rolf Tesmer has weighed in with his thoughts on the “Data Platform”:

What this has meant is that innovation – in particular in the Azure Public Cloud, ISVs, new data services/products, and new data-related infrastructure – has accelerated dramatically and changed the very definitions of what was previously accepted as comprising the “Data Platform”.

Nowadays when I talk to customers about the “Data Platform” it encompasses a range of services across a mix of IaaS, PaaS and SaaS.  The decision of which data service to deploy now comes down to matching the business case technical requirements with the capability of a purpose-built cloud service – as opposed to (in the past) trying to fit an obvious NoSQL use case into a traditional RDBMS platform.

I now see the “New Data Platform” as much broader than ever before, encompassing many other “non-traditional” data services…

Cf. Eugene Meidinger (who started this) and me (who exacerbated this).  This is an area ripe for consideration.

Stop Being Your Own Worst Enemy With Code

Bert Wagner has advice on making code understandable for future-you:

At the time I wrote it, I probably thought my code was beautiful. An elegant masterpiece. It should have been printed, framed, and hung on a wall of The Programming Hall of Fame. But as clever as I thought I may have been a few years ago, I am rarely able to read my old code without wasting some serious time debugging.

This problem plagued me regularly, so I tried different techniques to make my code easier to understand.

Bert has some good thoughts here, and I’ll add two small bits.  First, there’s a saying that it takes more mental effort to debug code than it takes to write it, so if you’re writing code at the edge of your understanding, effective debugging becomes difficult to impossible.  Second, unless you see a business rule frequently enough to internalize it, your greatest familiarity with the “whys” of the system is right when you are developing.  There is huge value in taking the time to document the rules in an accessible manner; even if you wrote the code, you probably won’t remember that weird edge case at 4 AM six months from now, when you need to remember it the most.

Considerations For Reducing I/O Costs

Monica Rathbun gives a few methods for reducing how many I/O operations a query requires:

Implicit Conversions

Implicit conversions often happen when a query is comparing two or more columns with different data types. In the example below, the system has to perform extra I/O to compare a varchar(max) column to an nvarchar(4000) column, which leads to an implicit conversion and, ultimately, a scan instead of a seek. By fixing the tables to have matching data types, or simply converting the value before evaluation, you can greatly reduce I/O and improve cardinality estimates (the number of rows the optimizer expects).
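To make that concrete, here is a minimal hypothetical repro (Monica’s post has its own demo; these object names are mine):

CREATE TABLE dbo.Accounts
(
    AccountID     INT IDENTITY(1,1) PRIMARY KEY,
    AccountNumber VARCHAR(20) NOT NULL
);
CREATE INDEX IX_Accounts_AccountNumber ON dbo.Accounts (AccountNumber);

-- NVARCHAR literal vs. VARCHAR column: the column side gets a CONVERT_IMPLICIT,
-- which (depending on collation) turns an index seek into a scan
SELECT AccountID FROM dbo.Accounts WHERE AccountNumber = N'ABC-1001';

-- Matching types: no conversion, and the optimizer can seek the index
SELECT AccountID FROM dbo.Accounts WHERE AccountNumber = 'ABC-1001';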

There’s some good advice here if your main hardware constraint is I/O.

Dark Queries

Michael Swart helps root out queries which get recompiled frequently and so won’t be in the cache:

Some of my favorite monitoring solutions rely on the cached queries:

but some queries will fall out of cache or never make it into cache at all. Those are the dark queries I’m interested in today. Let’s look at query recompiles to shed light on some of those dark queries that maybe we’re not measuring.

By the way, if you’re using SQL Server 2016’s query store then this post isn’t for you because Query Store is awesome. Query Store doesn’t rely on the cache. It captures all activity and stores queries separately – Truth in advertising!

Click through for an Extended Event session which looks for recompilation.
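His session is worth grabbing from the post itself; as a rough sketch of the shape such a session takes (the session and file names here are mine):

CREATE EVENT SESSION DarkQueries ON SERVER
ADD EVENT sqlserver.sql_statement_recompile
(
    ACTION (sqlserver.sql_text, sqlserver.database_name)
)
ADD TARGET package0.event_file (SET filename = N'DarkQueries');

ALTER EVENT SESSION DarkQueries ON SERVER STATE = START;

The recompile_cause field on each event tells you why the statement recompiled (temp tables, statistics changes, SET option changes, and so on).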

Resetting SQL Administrators

Chris Lumnah shows how to use dbatools to reset a SQL authenticated administrative account:

As I was going through my environment, I realized I created a new domain controller for my tests. This DC has a new name and domain name which are different from my other VMs. I quickly realized that this will cause me issues later with authentication. No worries. I will just boot up the VMs and then join them to the new domain. Easy-peasy. Now let me go test out my SQL Servers.

DOH!!

I received a login failure: access is denied. Using Windows Authentication with my new domain and recently joined server is not working. Why? …Oh right, my new user ID does not have access to SQL Server itself. As I sit there smacking myself in the head, I am also thinking about the amount of time it will take me to rebuild those VMs. Then it hit me!!!

Read on to see the solution, including a PowerShell one-liner showing how it’s done.
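The one-liner is presumably dbatools’ Reset-DbaAdmin, which handles restarting the instance in single-user mode for you. Doing it by hand, the T-SQL at the core of it (run over a single-user-mode or DAC connection as a local administrator) amounts to a sketch like this, with a hypothetical login name:

-- Create a new sysadmin-level SQL login...
CREATE LOGIN RecoveryAdmin WITH PASSWORD = N'<a strong password>', CHECK_POLICY = ON;
ALTER SERVER ROLE sysadmin ADD MEMBER RecoveryAdmin;

-- ...or reset and re-enable an existing SQL login such as sa
ALTER LOGIN sa WITH PASSWORD = N'<a strong password>';
ALTER LOGIN sa ENABLE;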
