Press "Enter" to skip to content

Day: February 8, 2022

Azure ML and the Python SDK in VS Code

I continue a series on getting beyond the basics with Azure ML. First up, we get up close and personal in development:

Notebooks are great for ad hoc work or simple data analysis but we will want more robust tools if we wish to perform proper code development, testing, and deployment. This is where Visual Studio Code comes into play, particularly the Azure Machine Learning extension.

Then, I get into the Python SDK:

Over the past two posts, we have started using the Azure Machine Learning SDK for Python but I’ve only touched on the topic. In this post, we are going to dive into the topic.

Read on for more info on each.

Comments closed

Data Compression and Caching

Paul Randal answers a question about caching of compressed values in SQL Server:

In the previous Curious Case post, I explained how to monitor how well page compression is working. That prompted a reader to ask me how to monitor the hit rate of lookups in the cache of decompressed column values. She was very surprised by my answer…

Read on for Paul’s answer. I’d say that even with the information Paul discloses, I don’t think I’ve ever seen page compression be a bad idea in any system I’ve worked with. There are specific tables in which it’s a bad idea but never have I seen a situation in which I could not use page compression at all due to a performance impact.

Comments closed

Time and Unit Tests

Michael J. Swart says, look at the time!:

A flaky test is a unit test that sometimes passes and sometimes fails. The causes of these flaky tests are often elusive because they’re not consistently reproducible.

I’ve found that unit tests that deal with dates and times are notorious for being flaky – especially such tests that talk to SQL Server. I want to explore some of the reasons this can happen.

As a quick note, if you’re using time in database unit tests, don’t use GETUTCDATE() or GETDATE() or any other function like that. It’s a non-deterministic function. Instead, use specific dates and times. That way, you can explicitly test for the types of things Michael points out.

Comments closed

Querying Stats Data with a DMF

Grant Fritchey wants queryable data:

We’ve always been able to look at statistics with DBCC SHOW_STATISTICS. You can even tell SHOW_STATISTICS to only give you the properties, STAT_HEADER, or histogram, HISTOGRAM. However, it’s always come back in a format that you can’t easily consume in T-SQL. From SQL Server 2012 to everything else, you can simply query sys.dm_db_stats_properties to get that same header information, but in a consumable fashion.

Read on for a quick post showing a couple of things you can do with the DMF.

Comments closed

Upgrade Strategies

Deepthi Goguri discusses upgrading:

When I started my first job as a DBA seven years ago, my project was to migrate several SQL Servers and all the servers were in SQL Server 2000. In my first SQL class at my school, I started my learning with SQL Server 2012. It was a shock to me to work on SQL 2000 databases at the time (as I am not familiar with the SQL Server 2000 yet), especially as it was my first job as a DBA.

My first project was to migrate approximately two hundred and fifty SQL 2000 SQL Servers to SQL Server 2012/2016. It took us a couple of years to successfully migrate all these Servers.

Deepthi mentions fear as a demotivating factor. In fairness, fear is a valid response to upgrades for two separate reasons: first, because the changes they release might break your existing code (something very common in the data science world); and second, because new code has new bugs that you haven’t discovered or worked around yet.

Comments closed