Press "Enter" to skip to content

Author: Kevin Feasel

Finding Data Which Breaks Constraints

Phil Factor has a procedure to test disabled check constraints and find data which would cause an error:

However often I go on about CHECK constraints, there will always be a developer who will leave them out or mutter in a dignified manner about how all checks need to be done only at the application level. This attitude soon gets divine retribution. Bad data springs up like a rotting fungus over your database unless you add CHECK constraints to all your tables. This is fine but then how do you prevent the excellent and estimable habit of adding them to then interfere with a release? The constraints will stop the build if they meet bad data: it is what they are trained to do. If you don’t like that, then you must fix the bad data first.

Click through for the process.

Comments closed

Qutoed Data and OPENROWSET

Dave Mason wants to remove quoted identifiers from a flat file:

I haven’t shown all the columns, but you get the idea–every column in the result set has data enclosed in double quotes. That’s exactly how it appears in the source data file.

Dave has a method which works for plenty of versions of SQL Server. If you’re using 2017 or later, the FIELDQUOTE parameter was added to solve this problem, though to be fair, I haven’t actually tried it to see if it works as expected.

Comments closed

Tips for Using Azure Storage

James Serra takes us through Azure Data Lake Store Gen2 and Azure Blob Storage:

Azure Data Lake Store (ADLS) Gen2 should be used instead of Azure Blob Storage unless there is a needed feature that is not yet GA’d in ADLS Gen2.

The major features that are missing from ADLS Gen2 are premium tiersoft deletepage blobsappend blobs, and snapshots. The major features that are in preview are archive tierlifecycle management, and diagnostic logs. Check out all the missing features at Known issues with Azure Data Lake Storage Gen2.

Note that underneath the covers, ADLS Gen2 uses Azure Blob Storage and is simply a layer over blob storage providing additional features (i.e. hierarchical file system, better performance, enhanced security, Hadoop compatible access).

Click through for a bullet point list of useful information.

Comments closed

Querying SQL Server Replicas Under Load

Taryn Pratt points out that replicas of data can contain stale data:

Last week at Stack Overflow we had an internal hack-a-thon, or as we call it, a make-a-thon. I was on the bug-bashing team, which is the team that attempts to fix smallish bugs we haven’t gotten around to fixing, due to other time-constraints. I was tagged to investigate a bug about duplicate badges being awarded because it looked to possibly be an easy fix in SQL. At first glance it looked simple enough, but once I started digging in, I figured out very quickly it wouldn’t be.

It’s an interesting problem, but no solutions in the post. It’s a hard problem.

Comments closed

Databricks Automated Deployment and Testing

Li Yu, et al, explain how to use Databricks notebooks and MLflow to automate deployment and testing of Spark solutions:

Today many data science (DS) organizations are accelerating the agile analytics development process using Databricks notebooks.  Fully leveraging the distributed computing power of Apache Spark™, these organizations are able to interact easily with data at multi-terabytes scale, from exploration to fast prototype and all the way to productionize sophisticated machine learning (ML) models.  As fast iteration is achieved at high velocity, what has become increasingly evident is that it is non-trivial to manage the DS life cycle for efficiency, reproducibility, and high-quality. The challenge multiplies in large enterprises where data volume grows exponentially, the expectation of ROI is high on getting business value from data, and cross-functional collaborations are common.

In this blog, we introduce a joint work with Iterable that hardens the DS process with best practices from software development.  This approach automates building, testing, and deployment of DS workflow from inside Databricks notebooks and integrates fully with MLflow and Databricks CLI. It enables proper version control and comprehensive logging of important metrics, including functional and integration tests, model performance metrics, and data lineage. All of these are achieved without the need to maintain a separate build server.

Read on to see how.

Comments closed

Finding SSRS Log Files

Rudy Rodarte explains where you can find SQL Server Reporting Services log files:

After many years working with SQL Server Reporting Services, I’ve gotten myself into a few situations where I had to examine the SSRS log to see what was happening with data sources, subscriptions, and other, unique SSRS particulars. For a time, I thought that the regular SQL Server Error Log was enough. But experience has taught me that there is much more information inside of the SSRS log file than the SQL Server Agent and Service logs. But, you must know where to look to find the SSRS Log file.

Those logs often contain good information.

Comments closed

Managing Systems with Azure Arc

Robert Smit takes us through Azure Arc:

This Blog post is about Azure Arc, how to set this up and get you started with Azure Arc. For customers who want to simplify complex and distributed environments across on-premises, edge and multi cloud, Azure Arc enables deployment of Azure services anywhere and extends Azure management to any infrastructure.

So Azure Arc is not a replacement for the old Azure Server manager tools! So no remote RDP or open MMC only log analytics, policy’s, CLI etc. https://robertsmit.wordpress.com/2016/08/25/azure-server-management-tools-manage-your-servers-from-anywhere-servermgmt-azure-smt/

Click through for a demonstration.

Comments closed

Memory Optimizer Advisor

Monica Rathbun takes us through the Memory Optimization Advisor in SQL Server Management Studio:

Previously I wrote about In-Memory Optimized Tables, in this blog I am going to discuss how to determine which tables could benefit from being In-Memory by using a tool called Memory Optimization Advisor (MOA). This a is a tool built into SQL Server Management Studio (SSMS) that will inform you of which tables could benefit  using In Memory OLTP capabilities and which may have non supported features. Once identified, MOA will help you to actually migrate that table and data to be optimized. Let’s see how it works by walking through it using a table I use for demonstrations in AdventureWorks2016CTP3. Since this is a smaller table and doesn’t incur a ton writes it is not a good use case, however, for simplicity I am using it for this demo.

This is good for seeing which tables could easily move to memory-optimized and which you shouldn’t even try.

Comments closed

Transactions Rolling Back During CHECKDB

Paul Randal explains a long-standing bug in CHECKDB:

Continuing the database snapshot theme from the previous Curious Case post, I had another question from someone who was concerned about transactions rolling back during DBCC CHECKDB. They’d just noticed the messages in the error log saying that when DBCC CHECKDB was executed, it was causing transactions to roll back in the database – and how could that possibly be allowed to happen? They said they panicked and stopped all DBCC CHECKDB executions.

There’s no need to panic. The problem is actually a bug in the database snapshot code that’s been there since SQL Server 2005, where it reports the wrong database name.

Read on for a demo; it’s not a dangerous problem.

Comments closed