Day: January 17, 2020

Databricks Automated Deployment and Testing

Li Yu, et al., explain how to use Databricks notebooks and MLflow to automate deployment and testing of Spark solutions:

Today many data science (DS) organizations are accelerating the agile analytics development process using Databricks notebooks. Fully leveraging the distributed computing power of Apache Spark™, these organizations are able to interact easily with data at multi-terabyte scale, from exploration to fast prototyping and all the way to productionizing sophisticated machine learning (ML) models. As fast iteration is achieved at high velocity, what has become increasingly evident is that it is non-trivial to manage the DS life cycle for efficiency, reproducibility, and high quality. The challenge multiplies in large enterprises where data volume grows exponentially, the expectation of ROI is high on getting business value from data, and cross-functional collaborations are common.

In this blog, we introduce a joint work with Iterable that hardens the DS process with best practices from software development.  This approach automates building, testing, and deployment of DS workflow from inside Databricks notebooks and integrates fully with MLflow and Databricks CLI. It enables proper version control and comprehensive logging of important metrics, including functional and integration tests, model performance metrics, and data lineage. All of these are achieved without the need to maintain a separate build server.

Read on to see how.

Querying SQL Server Replicas Under Load

Taryn Pratt points out that replicas of data can contain stale data:

Last week at Stack Overflow we had an internal hack-a-thon, or as we call it, a make-a-thon. I was on the bug-bashing team, which is the team that attempts to fix smallish bugs we haven’t gotten around to fixing, due to other time-constraints. I was tagged to investigate a bug about duplicate badges being awarded because it looked to possibly be an easy fix in SQL. At first glance it looked simple enough, but once I started digging in, I figured out very quickly it wouldn’t be.

It’s an interesting problem, but the post offers no solution, because it’s a hard problem.

Managing Systems with Azure Arc

Robert Smit takes us through Azure Arc:

This Blog post is about Azure Arc, how to set this up and get you started with Azure Arc. For customers who want to simplify complex and distributed environments across on-premises, edge and multi cloud, Azure Arc enables deployment of Azure services anywhere and extends Azure management to any infrastructure.

So Azure Arc is not a replacement for the old Azure Server management tools! So no remote RDP or open MMC, only Log Analytics, policies, CLI, etc. https://robertsmit.wordpress.com/2016/08/25/azure-server-management-tools-manage-your-servers-from-anywhere-servermgmt-azure-smt/

Click through for a demonstration.

Finding SSRS Log Files

Rudy Rodarte explains where you can find SQL Server Reporting Services log files:

After many years working with SQL Server Reporting Services, I’ve gotten myself into a few situations where I had to examine the SSRS log to see what was happening with data sources, subscriptions, and other, unique SSRS particulars. For a time, I thought that the regular SQL Server Error Log was enough. But experience has taught me that there is much more information inside of the SSRS log file than the SQL Server Agent and Service logs. But, you must know where to look to find the SSRS Log file.

Those logs often contain good information.

Transactions Rolling Back During CHECKDB

Paul Randal explains a long-standing bug in CHECKDB:

Continuing the database snapshot theme from the previous Curious Case post, I had another question from someone who was concerned about transactions rolling back during DBCC CHECKDB. They’d just noticed the messages in the error log saying that when DBCC CHECKDB was executed, it was causing transactions to roll back in the database – and how could that possibly be allowed to happen? They said they panicked and stopped all DBCC CHECKDB executions.

There’s no need to panic. The problem is actually a bug in the database snapshot code that’s been there since SQL Server 2005, where it reports the wrong database name.

Read on for a demo; it’s not a dangerous problem.

Memory Optimization Advisor

Monica Rathbun takes us through the Memory Optimization Advisor in SQL Server Management Studio:

Previously I wrote about In-Memory Optimized Tables; in this blog I am going to discuss how to determine which tables could benefit from being In-Memory by using a tool called Memory Optimization Advisor (MOA). This is a tool built into SQL Server Management Studio (SSMS) that will inform you of which tables could benefit from using In-Memory OLTP capabilities and which may have unsupported features. Once identified, MOA will help you to actually migrate that table and data to be optimized. Let’s see how it works by walking through it using a table I use for demonstrations in AdventureWorks2016CTP3. Since this is a smaller table and doesn’t incur a ton of writes, it is not a good use case; however, for simplicity I am using it for this demo.

This is good for seeing which tables could easily move to memory-optimized and which you shouldn’t even try.
