Kevin Feasel – Page 561

Azure ML and the Python SDK in VS Code

Published 2022-02-08 by Kevin Feasel

I continue a series on getting beyond the basics with Azure ML. First up, we get up close and personal in development:

Notebooks are great for ad hoc work or simple data analysis but we will want more robust tools if we wish to perform proper code development, testing, and deployment. This is where Visual Studio Code comes into play, particularly the Azure Machine Learning extension.

Then, I get into the Python SDK:

Over the past two posts, we have started using the Azure Machine Learning SDK for Python but I’ve only touched on the topic. In this post, we are going to dive into the topic.

Read on for more info on each.

Unique Clustered Indexes and Included Columns

Published 2022-02-08 by Kevin Feasel

Greg Dodd gives it away in the title:

I needed to add a Unique Constraint today to a table. We could go ahead and just add a constraint, but the data I wanted to constrain on was already indexed. Could I just make that a Unique Index and be done with it? Let’s find out with the following table:

Let’s, shall we?

Time and Unit Tests

Published 2022-02-08 by Kevin Feasel

Michael J. Swart says, look at the time!:

A flaky test is a unit test that sometimes passes and sometimes fails. The causes of these flaky tests are often elusive because they’re not consistently reproducible.
I’ve found that unit tests that deal with dates and times are notorious for being flaky – especially such tests that talk to SQL Server. I want to explore some of the reasons this can happen.

As a quick note, if you’re using time in database unit tests, don’t use GETUTCDATE(), GETDATE(), or any other function like them. These functions are non-deterministic. Instead, use specific dates and times. That way, you can explicitly test for the types of things Michael points out.
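To make that concrete, here is a minimal sketch in Python rather than T-SQL. The `is_expired` function and its one-day rule are hypothetical, made up purely for illustration and not taken from Michael's post. The point is that the code under test receives "now" as an argument, so the test can supply fixed timestamps and probe the exact boundary.

```python
from datetime import datetime, timedelta, timezone


def is_expired(created_at: datetime, now: datetime, ttl: timedelta) -> bool:
    """Return True when a record created at `created_at` has reached `ttl` as of `now`.

    Hypothetical example logic: `now` is a parameter rather than a call to
    datetime.now(), so callers (and tests) control the clock.
    """
    return now - created_at >= ttl


# Fixed, explicit timestamps: the outcome is identical on every run,
# including runs that happen to straddle midnight.
created = datetime(2022, 2, 8, 23, 59, 59, tzinfo=timezone.utc)
just_before = datetime(2022, 2, 9, 23, 59, 58, tzinfo=timezone.utc)
exactly_at = datetime(2022, 2, 9, 23, 59, 59, tzinfo=timezone.utc)

one_day = timedelta(days=1)
expired_just_before = is_expired(created, just_before, one_day)
expired_exactly_at = is_expired(created, exactly_at, one_day)
```

The same pattern should carry over to database tests: pass the timestamp in as a parameter instead of calling a current-time function inside the code being tested.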

Upgrade Strategies

Published 2022-02-08 by Kevin Feasel

Deepthi Goguri discusses upgrading:

When I started my first job as a DBA seven years ago, my project was to migrate several SQL Servers and all the servers were in SQL Server 2000. In my first SQL class at my school, I started my learning with SQL Server 2012. It was a shock to me to work on SQL 2000 databases at the time (as I am not familiar with the SQL Server 2000 yet), especially as it was my first job as a DBA.
My first project was to migrate approximately two hundred and fifty SQL 2000 SQL Servers to SQL Server 2012/2016. It took us a couple of years to successfully migrate all these Servers.

Deepthi mentions fear as a demotivating factor. In fairness, fear is a valid response to upgrades for two separate reasons: first, because the changes in a new release might break your existing code (something very common in the data science world); and second, because new code has new bugs that you haven’t discovered or worked around yet.

Querying Stats Data with a DMF

Published 2022-02-08 by Kevin Feasel

Grant Fritchey wants queryable data:

We’ve always been able to look at statistics with DBCC SHOW_STATISTICS. You can even tell SHOW_STATISTICS to only give you the properties, STAT_HEADER, or histogram, HISTOGRAM. However, it’s always come back in a format that you can’t easily consume in T-SQL. From SQL Server 2012 to everything else, you can simply query sys.dm_db_stats_properties to get that same header information, but in a consumable fashion.

Read on for a quick post showing a couple of things you can do with the DMF.

Connection Tips on Working with Big Data Clusters

Published 2022-02-08 by Kevin Feasel

Bob Dorr shows how to connect to SQL Server Big Data Clusters:

This blog post focuses on connecting to the SQL Server BDC, some helpful log files and utility outputs. Understanding a few basics about a SQL Server BDC and the Contained Availability Group (containedag) makes managing and troubleshooting easier.

Read on for tips and tricks from Bob.

Enabling the MySQL General Query Log

Published 2022-02-08 by Kevin Feasel

Chad Callihan wants to troubleshoot some queries:

We have Extended Events and Profiler in SQL Server for tracking database activity. What if we want to track queries in MySQL? Let’s take a look at a few methods to do just that.

Read on to see what options are available in MySQL. If you’re coming at MySQL from a SQL Server background, you won’t have exactly the same tools available to you.

Building a Simple Streamlit App

Published 2022-02-07 by Kevin Feasel

I jump into a new web framework:

In the course of working on my book, I wanted to build an easy-to-use website for outlier detection. The idea here is that I have a REST API to perform the outlier detection work but I’d like something a little easier to read than JSON blobs coming out of Postman. That’s where Streamlit comes into play.

Click through to see how it all works. I was impressed with how easy it was to build a decent interactive website.

Building a Calendar Table in Spark

Published 2022-02-07 by Kevin Feasel

Chris Koester brings calendar tables to Spark:

With the Data Lakehouse architecture shifting data warehouse workloads to the data lake, the ability to generate a calendar dimension (AKA date dimension) in Spark has become increasingly important. Thankfully, this task is made easy with PySpark and Spark SQL. Let’s dive right into the code!

Read on to see how you can create one.
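For a sense of what goes into a calendar dimension, here is a rough sketch in plain Python. This is not Chris's code and it does not use Spark at all; the `build_calendar` function and the column names are assumptions chosen for illustration. It only shows the shape of the rows: one per date, with a handful of derived attributes.

```python
from datetime import date, timedelta
from typing import Dict, List


def build_calendar(start: date, end: date) -> List[Dict[str, object]]:
    """Build one row per day from `start` through `end`, inclusive.

    Column names here are hypothetical; a real date dimension usually
    carries many more attributes (fiscal periods, holidays, and so on).
    """
    rows = []
    current = start
    while current <= end:
        rows.append({
            # Integer key in YYYYMMDD form
            "date_key": current.year * 10000 + current.month * 100 + current.day,
            "date": current,
            "year": current.year,
            "quarter": (current.month - 1) // 3 + 1,
            "month": current.month,
            # ISO numbering: 1 = Monday through 7 = Sunday
            "day_of_week": current.isoweekday(),
            "is_weekend": current.isoweekday() >= 6,
        })
        current += timedelta(days=1)
    return rows


calendar = build_calendar(date(2022, 2, 1), date(2022, 2, 28))
```

In Spark, you would generate the date range and derive these columns inside the engine rather than looping in Python, which is what the linked post covers.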

Making a Scatter Plot in Excel

Published 2022-02-07 by Kevin Feasel

Mike Cisneros shows how to create a nice-looking scatter plot in Excel:

Scatter plots are excellent charts for showing a relationship between two numerical variables across a number of unique observations. We see them in business communications from time to time, although they’re much more commonly used in the “exploration” part of the process—when we’re still trying to understand our data and find the important insights.
If you’re unfamiliar with scatter plots, their common use cases, or their benefits and drawbacks in a range of scenarios, check out the what is a scatter plot? article in our SWD Chart Guide. There, we explore some of the basics of scatter plots via an example, share tips for designing them more effectively, and discuss common variations (bubble charts, connected scatter plots, and more).

Read on for the process, which can be a lot more difficult than you may first expect.

Author: Kevin Feasel